Theoretical and empirical evaluations were also made of the effects of
guessing on the dimensionality of test data. The results indicated that
guessing affected highly discriminating items more so than poorly
discriminating items. However, the effect of guessing on the dimensionality
of tests with common distributions of difficulty and discrimination indices
was found to be minimal. Of the procedures evaluated for sorting items
into unidimensional item sets, principal factor analysis of phi coefficients
gave the best results overall. Nonmetric multidimensional scaling also
showed promise when used with Yule's Y, phi, or tetrachoric similarity
coefficients, but it did not perform as well as the factor analytic
techniques on the real test data. In summary, guessing does have an effect
on test data, but the effect is not very large unless items of extreme
difficulty are present in the test. Of the procedures evaluated,
traditional factor analytic techniques gave the most useful information for
sorting test items into homogeneous sets.
The Formation of Homogeneous Item Sets
When Guessing Is a Factor in Item Responses



One of the fundamental assumptions of most latent trait models is that
the items in the pool of interest measure a single latent trait (Lord and
Novick, 1968). Although some item pools do approximate the conditions
specified by this assumption (e.g., vocabulary, arithmetic computation, digit
span), in many cases item pools do not automatically fulfill the requirements
of a one-dimensional latent space. For example, most achievement tests
designed using a table of specifications are not unidimensional. Further, it
is questionable whether some criterion-referenced test item domains measure
a single dimension. Therefore, some procedure is needed to form unidimensional
item sets for use with latent trait models.

Unfortunately, the procedures commonly used to form item sets that are
homogeneous in the ability measured have been criticized because of some basic
inadequacies. Most of these criticisms stem from the use of items that are
dichotomously scored. Factor analysis, for example, was derived for use with
continuous variables. Since its basic model reproduces the observed score
from a linear combination of continuous variables, there is no way that
dichotomous responses can be adequately modeled. A symptom of this problem is
the difficulty factors obtained when phi coefficients are factor analyzed. In
an attempt to alleviate this problem, tetrachoric correlations are often used
in place of phi coefficients. However, these correlations may not yield
correlation matrices that have the appropriate properties for factor analysis
(i.e., positive semidefinite). The end result of these problems is that the
most commonly used multivariate sorting procedure is theoretically inadequate
for forming unidimensional item sets when dichotomously scored items are used.
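The difficulty-factor problem can be illustrated with a small calculation. The sketch below is not taken from the report: the function name and the two item difficulties are hypothetical. It computes the phi coefficient from the proportions in a two-way table and shows that two items that are as consistent as their difficulties allow can still have a phi well below 1.0 when their difficulties differ.

```python
from math import sqrt


def phi_coefficient(p11, p1, p2):
    """Phi coefficient between two 0/1 items.

    p11 is the proportion answering both items correctly; p1 and p2 are
    the proportions answering each item correctly (hypothetical inputs).
    """
    return (p11 - p1 * p2) / sqrt(p1 * (1 - p1) * p2 * (1 - p2))


# Hypothetical items: a hard item (20% correct) and an easy item (80% correct).
# Everyone who passes the hard item also passes the easy item, so the joint
# proportion equals the hard item's proportion correct.
p_hard, p_easy = 0.20, 0.80
phi_unequal = phi_coefficient(p11=p_hard, p1=p_hard, p2=p_easy)

# Same perfect consistency, but with equal difficulties.
phi_equal = phi_coefficient(p11=0.50, p1=0.50, p2=0.50)

print(round(phi_unequal, 3), round(phi_equal, 3))
```

Because phi is capped by the difference in difficulty, items of similar difficulty tend to correlate more highly with each other than with items of different difficulty, which is one way spurious difficulty factors can arise.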

In response to the problems in the use of factor analysis with dichotomous
variables, Christofferson (1975) has developed a factor analysis procedure
specifically for this special case. In order to avoid the problems stemming
from the use of correlation coefficients, he uses the proportions in two-way
tables of item responses as the basic data for determining the factor
structure. A generalized least squares procedure is used to estimate error
free proportions and, from these, the parameters of the factor analysis model.
The obtained parameter estimates have been shown to be consistent, and a
chi-square test has been developed to test the number of significant factors.
Although this procedure would seem to be the solution to the item factoring
problem, it can only be used on a maximum of 25 items because of computer
storage and computational time constraints. Thus, the procedure is not
practical for most item pool construction situations.

Another approach has been taken by Divgi (1980) to solve the item pool
dimensionality problem, but this procedure only provides a test for the
presence of a single factor, rather than a procedure for sorting items. In
the Divgi procedure, the probability of a correct response to an item (expected
response) determined from a latent trait model is subtracted from the actual
response to that item to obtain a residual. These residuals are then
intercorrelated over items and the resulting correlation matrix is factor
analyzed using [illegible in source]. If no [illegible] factors are left in the
[illegible], it is concluded that unidimensionality does hold [illegible].
This procedure is purported to be better than the usual factor analysis of
dichotomous variables because the correlations are based on the [illegible]
rather than binary data. However, this procedure is very [illegible] evaluated.
In any case, it does not yield a procedure for forming unidimensional item
sets.

[Illegible in source] procedures for determining the dimensionality of
[illegible]. [Illegible] analysis and multidimensional scaling procedures are
[illegible]. [These] procedures make fewer assumptions, but their usefulness
is unknown. Moreover, a review of the literature has not found any application
of these procedures to the unidimensionality issue.

[One] result of the confusion caused by the lack of good procedures for
forming unidimensional item sets is that often item pools are sorted
subjectively, without the aid of an analytic procedure. In many cases the
dimensionality [illegible in source] is not considered at all. Obviously, an
easily used procedure is needed to develop unidimensional item sets. One of
the purposes of this research is to find such a procedure.



Unfortunately, the mere fact that dichotomously scored items are being
used is not the only problem that affects the determination of the
dimensionality [illegible in source]. For multiple-choice items, guessing is
another factor that may affect the observed dimensionality. A review of the
literature on latent trait theory and multivariate clustering procedures has
found no studies on the effect of guessing on dimensionality, so the magnitude
of these effects is unknown. However, work has been done on the effects of
guessing on item analysis, correlation, and reliability. Some hints concerning
guessing effects can be discovered there.


Carroll (1945) studied the effect of varied item difficulty and guessing
on the magnitude of correlations between dichotomously scored items using
the "knowledge or random guessing model". He found that variations in both
difficulty and chance success bring about a reduction in the size of the
phi coefficient between items. He also discussed the use of tetrachoric
correlations with dichotomously scored test items, and showed that variations
in difficulty had no effect on the tetrachoric correlations when no guessing
was present and when the bivariate normal assumption was met. When guessing
was present in the data, the magnitude of the obtained correlations was
lowered. This effect was stronger for more difficult items. Along with his
analysis of the effects of guessing and difficulty on these two types of
correlations, Carroll also developed correction formulae to compensate for the
reduction in correlation. The correction for the tetrachoric correlation will
be described later in this report, since it was used in the research reported
here.

Plumlee (1952) expanded on Carroll's work to determine the effect of
variation in difficulty and guessing on item-test correlations and reliability.
She developed an equation in her article that showed the relationship between
biserial correlations determined with and without guessing present, and
another equation that showed the corresponding relationship for parallel form
reliability. In both cases, the equations predicted a reduction in the
magnitude of the statistics with the presence of guessing.




Plumlee then checked the accuracy of her equations by determining the
item discrimination values and reliability using items administered in
completion and multiple-choice form. The equations were used to predict the
values of the statistics for the multiple-choice tests from the completion
test statistics. The predictions were close, but there was a tendency to
overestimate the statistics. The differences were explained by the inaccuracy
of the "knowledge or random guessing model".

Mattson (1965) also determined the effects of guessing on reliability,
but he used a different approach than Plumlee. Mattson used a binomial error
model to estimate the standard error of measurement and the true score
variance. He then showed how the true score variance is reduced by guessing
effects. From the standard error and true score variance terms, he developed a
formula for the reliability of a test when guessing is a factor. The
reliability was shown to decline with increased guessing probability.

A totally different approach to the determination of the effects of
guessing on reliability was taken by Denney and Remmers (1940). They felt
that the addition of choices to a multiple-choice item was, in fact, analogous
to lengthening the test. Thus, the reliability of the test with more
alternative choices could be determined from the test with fewer choices using
the Spearman-Brown formula (a four choice test is twice as long as a two
choice test). In their article they present data that showed that the
Spearman-Brown formula does model the guessing effect fairly well. In that
study, vocabulary items were administered with two, three, four, or five
choices and the reliability was determined for each of the test forms using
the split-half method. In their article, as in all of the others, the
reliability decreased with increased guessing.
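The Denney and Remmers argument can be sketched with the generalized Spearman-Brown formula, treating the ratio of the numbers of alternatives as the ratio of test lengths. The function name and the starting reliability of .70 below are illustrative assumptions, not values from their article.

```python
def spearman_brown(reliability, length_ratio):
    """Generalized Spearman-Brown: reliability of a test whose length is
    multiplied by length_ratio (values below 1 shorten the test)."""
    return length_ratio * reliability / (1 + (length_ratio - 1) * reliability)


# Hypothetical two-choice test with reliability .70. Under the analogy in the
# text, a four-choice version is treated as a test twice as long.
two_choice_reliability = 0.70
four_choice_reliability = spearman_brown(two_choice_reliability, 4 / 2)

# Going the other way (fewer choices, more guessing) lowers the reliability.
back_to_two_choice = spearman_brown(four_choice_reliability, 2 / 4)

print(round(four_choice_reliability, 3), round(back_to_two_choice, 3))
```

Under this analogy, fewer alternatives (more guessing) always map to a shorter test and therefore to a lower predicted reliability.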

To summarize the various theoretical positions, the proportion of true
variance in a set of test scores was plotted against guessing level for a test
with a no-guessing reliability of .81. The results are shown in Figure 1. Four
plots are shown on this graph. The first is the predicted reliability of a
test as a function of guessing for a test with a no-guessing reliability of .81.
This plot was produced using Equation 30 developed by Carroll (1945). A 50 item
test composed of items with .50 traditional difficulty was assumed in making
this plot. The second plot shows the effects of guessing on the squared
biserial correlation between an item of .5 traditional difficulty and total
score when the no-guessing correlation is .9. This relationship was determined
from Equation 24 in the article by Plumlee (1952). The third line shows the
relationship between reliability and guessing given in Table 1 from an article
by Mattson (1965). A no-guessing reliability of .8 was assumed for this plot.
The fourth line shows the reliability as a function of guessing level as
determined by the method proposed by Denney and Remmers (1940). The values were
derived using the generalized Spearman-Brown formula, assuming a reliability
of .81 for a test composed of items with 10 alternatives.

As can be seen from this figure, the predicted reliabilities are quite
different. Other than concluding that the reliability declines, no consistent
prediction can be made about the magnitude of the decline. The implication of
these data for the proportion of common variance in a test is that guessing
effects will cause the common variance to decrease. The lower correlations
suggested by Carroll's work would also imply that the number of factors (in a
factor analytic sense) would increase.

Since no clear cut findings were discovered in the review of the
literature concerning the effects of guessing on multidimensional data
reduction techniques, the present research study was designed to further
explore these effects. More specifically, the purpose of the research was to
evaluate various procedures for forming homogeneous item sets, and to determine
the effects of guessing on the techniques. Three approaches were taken to
achieve this goal. First, a theoretical model was developed, and guessing
effects were predicted with the model. Second, simulated data were generated
using the theoretical model, and the predicted results were checked by actual
analysis of these data. Third, a real data-set was selected and analyzed to
determine how well the theoretical and simulated results generalized.
Conclusions were drawn from consistent patterns of findings from these three
sets of results.

The Theoretical Model



The basic model used here to determine the effects of guessing on the
proportion of common variance in an item is a modification of the true score
model presented in Lord and Novick (1968, pp. 30-38). A univariate model will
be presented first, followed by a multivariate generalization.

Suppose that a population of examinees is normally distributed on a
unidimensional trait, $T$, that is required, to some extent, for performance
on a test item. Without loss of generality, this distribution can be assumed
to have a mean of zero and a variance of one. That is, $T \sim N(0, 1)$. Suppose
further that the trait measured by a test item, $T'$, is not exactly the same
as trait $T$, but that it has a positive relationship with $T$. If the
correlation between the person trait, $T$, and the trait measured by the item,
$T'$, is given by $a$, then the score on Trait $T'$ for Person $j$, $t_j$, can
be estimated from his/her score on $T$, $\tau_j$, by the formula

$$ t_j = a\tau_j \tag{1} $$

if a linear relationship is assumed, and if $T'$ is assumed to have a standard
normal distribution, that is, $T' \sim N(0, 1)$.

If the item in question yields responses on a continuous scale, the
observed score on the item is given by the usual true score model as

$$ x_j = t_j + \varepsilon \tag{2} $$

where $x_j$ is the observed score on the item for Person $j$, $t_j$ is the
person's true score on the trait defined by the item score, and $\varepsilon$
is a random error term which is distributed $\varepsilon \sim N(0, \sigma^2)$,
$\sigma > 0$, for Person $j$. Based on Equations 1 and 2,

$$ \begin{aligned}
E(x_j) &= E(t_j) + E(\varepsilon) \\
       &= E(a\tau_j) + 0 \\
       &= a\tau_j
\end{aligned} \tag{3} $$

since trait estimate $\tau_j$ is constant for Person $j$. Since $E(x_j)$ is the
classical definition of a true score, the true score on the item is defined as
$a\tau_j$.

The variance of the observed score for Person $j$ on the item is given by

$$ \begin{aligned}
V(x_j) &= V(t_j + \varepsilon) \\
       &= V(t_j) + V(\varepsilon) + 2\,\mathrm{cov}(t_j, \varepsilon).
\end{aligned} $$

Since $t_j$ is constant for Person $j$, and since the covariance with error is
assumed to be zero,

$$ V(x_j) = 0 + V(\varepsilon) = \sigma^2. \tag{4} $$



Up to this point the expectation and variance of the observed score,
$x_j$, have been obtained based on the probability distribution of scores for
a single person. Similar results can also be determined for the entire
population of individuals. Notationally this will be indicated by starring the
subscript indicating the person. The expectation of the score on the item
is then given by

$$ \begin{aligned}
E(X_*) &= E(t_*) + E(\varepsilon) \\
       &= E(aT_*) + 0 \\
       &= a\,E(T_*) \\
       &= a \cdot 0 = 0.
\end{aligned} \tag{5} $$

The variance of scores on the test item is given by

$$ \begin{aligned}
V(X_*) &= V(t_*) + V(\varepsilon) + 2\,\mathrm{cov}(t_*, \varepsilon) \\
       &= V(aT_*) + \sigma^2 + 0 \\
       &= a^2\,V(T_*) + \sigma^2 \\
       &= a^2 + \sigma^2,
\end{aligned} \tag{6} $$

since $T_*$ has a variance of 1.0 and the covariance of the trait score and
error is assumed to be zero. If the item trait scores are assumed to be
in standard score form, $V(X_*) = a^2 + \sigma^2 = 1.0$. Therefore,

$$ \sigma^2 = 1 - a^2. $$

Equation 4 can then be written as

$$ V(x_j) = 1 - a^2. \tag{7} $$

Since the real interest of this report is the effects of guessing on
the factor structure of dichotomously scored tests, the continuous item
score will now be dichotomized by specifying a value $c$ related to the
difficulty of the item. If $x_j$ is greater than $c$, a score of 1.0 will be
assigned to Person $j$, and if $x_j$ is less than $c$, a score of 0.0 will be
assigned. More concisely,

$$ u_j = 1 \ \text{if } x_j > c, \qquad u_j = 0 \ \text{if } x_j < c, $$

where $u_j$ is the dichotomous score for the item for Person $j$.

The probability that a person with ability $\tau_j$ will get a score of
$u_j = 1$ on the item is

$$ P(u_j = 1 \mid \tau_j) = \int_{z}^{\infty} \phi(z)\,dz, \tag{8} $$

where

$$ z = \frac{c - E(x_j)}{\sqrt{V(x_j)}} = \frac{c - a\tau_j}{\sqrt{1 - a^2}}, $$

and $\phi(z)$ is the normal probability density function. The probability of a
score of $u_j = 0$ for a person with ability $\tau_j$ is

$$ P(u_j = 0 \mid \tau_j) = \int_{-\infty}^{z} \phi(z)\,dz. $$

This is essentially the normal ogive IRT model.
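As a rough illustration of Equation 8, the no-guessing probability of a correct response can be evaluated with the standard normal distribution function. The function and variable names below are hypothetical; the cutting score defaults to 0.0, which corresponds to an item of .5 traditional difficulty.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and variance 1


def p_correct_no_guessing(tau, a, c=0.0):
    """Equation 8: P(u = 1 | tau) for an item with loading a and cutting score c."""
    z = (c - a * tau) / sqrt(1.0 - a * a)
    return 1.0 - STD_NORMAL.cdf(z)


# A person at the population mean has an even chance on a c = 0 item,
# while a person one standard deviation below the mean rarely succeeds
# on a highly discriminating item (a = .9).
p_average = p_correct_no_guessing(tau=0.0, a=0.9)
p_low = p_correct_no_guessing(tau=-1.0, a=0.9)
print(round(p_average, 3), round(p_low, 3))
```

The probability rises with ability, and it rises more steeply for larger values of the loading, which is the sense in which the model is a normal ogive.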

If Person $j$ obtains a score of 0 on the item (i.e., he/she does not
know the correct answer), he/she may guess the correct answer with probability
$1/A$, where $A$ is the number of alternatives in the item if it is assumed to
be multiple-choice. That is, with probability $1/A$, the 0 will be changed to
a 1. Therefore, the probability of obtaining a score of 1 on the item when
guessing is a factor is given by

$$ P'(u_j = 1 \mid \tau_j) = P(u_j = 1 \mid \tau_j) + P(u_j = 0 \mid \tau_j) \cdot \frac{1}{A}. \tag{9} $$



One way to conceptualize the effect of guessing on this item is that
guessing causes the cutting score, $c$, to be shifted downward, increasing
the probability of a correct response. To determine the magnitude of this
shift, the cutting score, $c'$, that yields the correct probability of a correct
response, including the guessing effect, can be determined using the inverse
normal transformation:

$$ z'_j = \Phi^{-1}\bigl(1 - P'(u_j = 1 \mid \tau_j)\bigr). \tag{10} $$

The value of $c'$ is obtained by transforming this z-score to the observed score
scale using

$$ c'_j = z'_j \sqrt{1 - a^2} + a\tau_j. \tag{11} $$

Note that $c'$ has an index, $j$, denoting that its value may be different for
each person, depending on the ability level $\tau_j$. The guessing effect for
Person $j$ can then be defined as

$$ g_j = c - c'_j. \tag{12} $$

Another way of conceptualizing the effect of guessing is that it shifts
upward the examinee's propensity distribution by an amount $g_j$.

Based on the idea of guessing causing a shift in a person's propensity
distribution, a new continuous score for an item can be defined to include
guessing as a factor in the item response:

$$ y_j = t_j + \varepsilon + g_j, \tag{13} $$

where $g_j$ is constant for Person $j$ on any given item, but varies across
people and items. The expected value of this score for Person $j$ is

$$ \begin{aligned}
E(y_j) &= E(t_j) + E(\varepsilon) + E(g_j) \\
       &= E(a\tau_j) + 0 + g_j \\
       &= a\tau_j + g_j.
\end{aligned} \tag{14} $$

In classical test theory, this expected value is defined as the true score
on the item for Person $j$ (Lord & Novick, 1968). Notice that there is a
guessing component in this classical true score. The variance of $y$ for
Person $j$ is given by

$$ \begin{aligned}
V(y_j) &= V(t_j) + V(\varepsilon) + V(g_j) + 2\,\mathrm{cov}(t_j, g_j) + 2\,\mathrm{cov}(\varepsilon, g_j) + 2\,\mathrm{cov}(t_j, \varepsilon) \\
       &= 0 + 1 - a^2 + 0 + 0 + 0 + 0 \\
       &= 1 - a^2,
\end{aligned} \tag{15} $$

since $t_j$ and $g_j$ are constant and the covariance with error is assumed to
be zero.

The probability of a correct response to an item when ability is measured
on the y-scale (i.e., when guessing is a factor in the item response) is given
by

$$ P'(u_j = 1 \mid \tau_j) = \int_{z'}^{\infty} \phi(z)\,dz, \tag{16} $$

where

$$ z' = \frac{c - (a\tau_j + g_j)}{\sqrt{1 - a^2}}. $$



As was done previously, these results can be generalized to apply to
the scores obtained from a group of individuals rather than for a single
individual, as given in Equations 8 through 16. The expected value of the
continuous item score for the population of individuals when guessing is a
factor in responses is

$$ \begin{aligned}
E(Y_*) &= E(t_*) + E(\varepsilon) + E(G_*) \\
       &= E(aT_*) + 0 + E(G_*) \\
       &= a\,E(T_*) + E(G_*) \\
       &= E(G_*),
\end{aligned} \tag{17} $$

where $G_*$ is the random variable associated with the guessing effect. Thus,
the average score on the item for the population is increased over the
no-guessing score by an amount equal to $E(G_*)$. The variance of the Y-score
for the population is given by

$$ \begin{aligned}
V(Y_*) &= V(t_*) + V(\varepsilon) + V(G_*) + 2\,\mathrm{cov}(t_*, G_*) + 2\,\mathrm{cov}(\varepsilon, G_*) + 2\,\mathrm{cov}(t_*, \varepsilon) \\
       &= V(aT_*) + 1 - a^2 + V(G_*) + 2\,\mathrm{cov}(aT_*, G_*) + 0 + 0 \\
       &= a^2\,V(T_*) + 1 - a^2 + V(G_*) + 2a\,\mathrm{cov}(T_*, G_*) \\
       &= 1 + V(G_*) + 2a\,\mathrm{cov}(T_*, G_*).
\end{aligned} \tag{18} $$

From Equations 17 and 18, the proportion in the population that will obtain
a score of $u = 1$ can be determined. This proportion is given by

$$ P'(U_* = 1) = \int_{z'}^{\infty} \phi(z)\,dz, \tag{19} $$

where

$$ z' = \frac{c - E(G_*)}{\sqrt{1 + V(G_*) + 2a\,\mathrm{cov}(T_*, G_*)}}. $$



The development of this model has now reached the point where it can
be applied to the major area of interest of this paper: determining the
effect of guessing on the proportion of variance accounted for by the common
factors in a test. First, for the unifactor case, the proportion of variance
accounted for on the item in question by the unidimensional ability, $\tau$,
when no guessing is present can be obtained from Equation 6 and the expression
for the variance of the true scores, $t$, over the population of interest:

$$ V(t_*) = V(aT_*) = a^2\,V(T_*) = a^2. \tag{20} $$

The proportion of observed variance for the item accounted for by the true
scores is then

$$ \frac{V(t_*)}{V(X_*)} = \frac{a^2}{1} = a^2. \tag{21} $$

Thus, the proportion of variance accounted for by the item trait is simply
the squared correlation between the trait and the true score on the item.

The proportion of variance accounted for by the item true scores when
guessing is a factor in item responses is given by the ratio

$$ V(E(y_j)) / V(Y_*). $$

The numerator of this ratio, the variance of the true scores, can be obtained
from Equation 14 as

$$ \begin{aligned}
V(E(y_j)) &= V(aT_* + G_*) \\
          &= V(aT_*) + V(G_*) + 2\,\mathrm{cov}(aT_*, G_*) \\
          &= a^2\,V(T_*) + V(G_*) + 2a\,\mathrm{cov}(T_*, G_*) \\
          &= a^2 + V(G_*) + 2a\,\mathrm{cov}(T_*, G_*).
\end{aligned} \tag{22} $$

Using the value for variance of the observed score given by Equation 18, the
ratio of the true score variance to the observed score variance is given by

$$ \frac{V(E(y_j))}{V(Y_*)} = \frac{a^2 + V(G_*) + 2a\,\mathrm{cov}(T_*, G_*)}{1 + V(G_*) + 2a\,\mathrm{cov}(T_*, G_*)}. \tag{23} $$

In the univariate case, Equations 21 and 23 simply give the reliability of a
single test item.
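Equation 23 is simple enough to check by hand. The sketch below, with hypothetical function and variable names, evaluates the ratio for a .9 loading using the variance and covariance values reported in Table 2 for guessing levels of .05 and .15, so the results can be compared with the corresponding Table 3 entries for a cutting score of 0.

```python
def proportion_true_variance(a, var_g=0.0, cov_tg=0.0):
    """Equation 23: ratio of true score variance to observed score variance.

    With var_g = cov_tg = 0 (no guessing) this reduces to Equation 21, a**2.
    """
    guessing_terms = var_g + 2.0 * a * cov_tg
    return (a * a + guessing_terms) / (1.0 + guessing_terms)


no_guessing = proportion_true_variance(0.9)
# Table 2 values for a .9 loading: (VAR, COV) at guessing levels .05 and .15.
guess_05 = proportion_true_variance(0.9, var_g=0.08, cov_tg=-0.20)
guess_15 = proportion_true_variance(0.9, var_g=0.12, cov_tg=-0.28)

# Table 3 reports .81, .73, and .69 for these three cells; small differences
# are expected because the Table 2 inputs are rounded to two decimals.
print(round(no_guessing, 2), round(guess_05, 2), round(guess_15, 2))
```

Because the covariance between ability and the guessing effect is negative, the same quantity is subtracted from both numerator and denominator, which lowers the ratio below $a^2$.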

The results for the univariate case can be generalized to the
multivariate case by redefining $t_j$ as

$$ t_j = \sum_{k=1}^{m} a_k \tau_{jk}, \tag{24} $$

where $\tau_{jk} \sim N(0, 1)$ for each $j$ and $k$, the $a_k$ are the
correlations between the $\tau_{jk}$ and the continuous score on the test
items, and $m$ is the number of abilities required to perform on the items.
The $\tau_{jk}$ and $\tau_{jl}$ are assumed to be uncorrelated for $k \neq l$.
The proportion of common variance in the no-guessing case then becomes

$$ \frac{V(t_*)}{V(X_*)} = \sum_{k=1}^{m} a_k^2 = h^2, \tag{25} $$

where $h^2$ is the communality. When guessing is a factor, Equation 23 becomes

$$ \frac{V(E(y_j))}{V(Y_*)} = \frac{\sum_{k=1}^{m} a_k^2 + V(G_*) + 2\sum_{k=1}^{m} a_k\,\mathrm{cov}(T_{*k}, G_*)}{1 + V(G_*) + 2\sum_{k=1}^{m} a_k\,\mathrm{cov}(T_{*k}, G_*)}. \tag{26} $$
Predictions from the Theoretical Model




With the development of the theoretical model presented on the previous
pages, it is possible to determine the magnitude of guessing effects for
persons of a given ability, and for populations with known distributions of
ability, assuming the "knowledge or random guessing model" is correct. For
example, if an individual with known ability $\tau_j = -1$ is administered an
item with difficulty .5 for the population as a whole, a guessing level of .05,
and a correlation between the item and Trait $T$ of .9, several important
features can be determined. First, the expected score on the trait defined by
item performance can be obtained from Equation 1 as -.9. The variance of the
estimate on the item trait for the person is given by Equation 7 as .19.

Based on a cut score of 0.0 for the population for the no-guessing case,
the probability that Person $j$ obtains a correct response to the item can be
obtained from Equation 8 as .019. After introducing the effect of guessing
into this item, this person's probability of a correct response is .069
(from Equation 9). This change in probability requires moving the cutting
score to $z' = 1.49$ standard deviation units on the person's propensity
distribution (Equation 10), yielding a guessing effect from Equations 11 and
12 of .252.
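The worked example above can be retraced step by step. The following is a minimal sketch, assuming the reading of Equations 8 through 12 in which the shifted cutting score is the value that reproduces the guessing-inflated probability; the function name and return layout are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def guessing_effect_steps(tau, a, guess, c=0.0):
    """Return (p, p_guess, z_shift, g) for one person and one item.

    tau   : person's ability
    a     : correlation between item trait and person trait
    guess : probability of a correct guess (1/A), assumed to lie in (0, 1)
    c     : cutting score on the continuous item score scale
    """
    sigma = sqrt(1.0 - a * a)                           # Equation 7 (as a std. dev.)
    p = 1.0 - STD_NORMAL.cdf((c - a * tau) / sigma)     # Equation 8
    p_guess = p + (1.0 - p) * guess                     # Equation 9
    z_shift = STD_NORMAL.inv_cdf(1.0 - p_guess)         # Equation 10
    c_shift = z_shift * sigma + a * tau                 # Equation 11
    return p, p_guess, z_shift, c - c_shift             # Equation 12


p, p_guess, z_shift, g = guessing_effect_steps(tau=-1.0, a=0.9, guess=0.05)
# The text reports .019, .069, 1.49, and .252 for these four quantities.
print(round(p, 3), round(p_guess, 3), round(z_shift, 2), round(g, 3))
```

Calling the same function over a grid of ability values gives the guessing effect curve that the report integrates over the ability distribution.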

This same procedure can be followed for all levels of ability in the
population. If the probability distribution of ability in the population is
known, an expected guessing level for the population as a whole can be
determined using

$$ E(G) = \int_{-\infty}^{\infty} g\,f(g)\,dg. $$

Unfortunately, $g$ has a functional form that contains the inverse normal
function, so direct computation of the expected value is impossible.
Therefore, for the purposes of this report, $E(G)$ has been computed using the
cautious adaptive Romberg extrapolation method (IMSL, 1979) of numerical
integration.

Table 1 gives the magnitude of the expected value of the guessing effect
for combinations of the probability of guessing on the item and the correlation
between the item trait, $t$, and the person trait, $\tau$. The probability of
guessing is defined here as the probability of a correct response for a person
with no knowledge of the material measured by the item. The correlation between
the item and person traits is the same as the loading of the item on the first
factor of a test measuring the person trait. A cutting score of 0.0 was used
for all combinations of guessing level and factor loading because any other
cutting score would yield a simple linear transformation of these results.
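The expected guessing effect can be approximated with any standard quadrature rule. The sketch below substitutes composite Simpson integration over ability for the IMSL Romberg routine used in the report, so it is an approximation under that stated swap and not the original computation; all names are hypothetical. The guessing effect is written as $g = \sqrt{1 - a^2}\,(z - z')$, which follows from Equations 8 through 12.

```python
from math import erfc, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def guessing_effect(tau, a, guess, c=0.0):
    """Guessing effect g (Equation 12) at ability tau; assumes 0 <= guess < 1."""
    if guess <= 0.0:
        return 0.0
    sigma = sqrt(1.0 - a * a)
    z = (c - a * tau) / sigma
    p_wrong = 0.5 * erfc(-z / sqrt(2.0))  # Phi(z) = P(u = 0 | tau), stable in the tail
    if p_wrong == 0.0:                    # underflow: the person essentially never misses
        return 0.0
    # 1 - P'(u = 1 | tau) = (1 - guess) * P(u = 0 | tau), from Equation 9.
    z_shift = STD_NORMAL.inv_cdf((1.0 - guess) * p_wrong)
    return sigma * (z - z_shift)


def expected_guessing_effect(a, guess, c=0.0, lo=-8.0, hi=8.0, n=1600):
    """E(G) over a N(0, 1) ability distribution by composite Simpson's rule (n even)."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        tau = lo + i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * guessing_effect(tau, a, guess, c) * STD_NORMAL.pdf(tau)
    return total * h / 3.0


# Table 1 reports .14 for a guessing level of .05 and a correlation of .9.
print(round(expected_guessing_effect(a=0.9, guess=0.05), 2))
```

Integrating over ability rather than over $g$ itself avoids needing the density of $g$, since $g$ is a deterministic function of ability once the item is fixed.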
Most of the results presented in Table 1 match what would commonly be
expected of a guessing effect. As the probability of guessing increased, the
guessing effect increased. However, for low guessing probabilities the
guessing effect increased with increased factor loading, while for high
guessing probabilities, the guessing effect decreased with increased factor
loading. At a guessing probability of approximately .25, the guessing effect
was fairly constant. This interaction of guessing probability and factor
loading was unanticipated.

Table 1

Expected Value of the Guessing Effect for c = 0 and Various Combinations
of Guessing Level and Correlation Between Item Trait and Person Trait

Guessing     Correlation Between Person Trait and Item Trait
Level           .6       .7       .8       .9

 .05           .07      .08      .10      .14
 .15           .19      .19      .21      .24
 .25           .30      .30      .30      .31
 .35           .41      .40      .39      .38
 .45           .53      .51      .48      .45
 .55           .66      .62      .58      .51
 .65           .80      .75      .68      .59
 .75           .98      .91      .81      .68

Note. Expected values were based on a N(0, 1) ability distribution.



The reason for this interaction can be determined from Figures 2a, 2b,
2c, and 2d, which show the probability of a correct response to the item
with and without guessing, the guessing effect at various ability levels,
and the ability density function for guessing probabilities and first factor
loadings of (.05, .6), (.05, .9), (.45, .6), and (.45, .9), respectively. From
these figures it can be seen that the magnitude of guessing increases more
quickly with decrease in ability for the .9 loading case than the .6 loading
case. This yields a higher expectation for the .9 loading case with a .05
guessing level than for the .6 loading case, because the guessing effect
reaches an appreciable size within the range containing most of the ability
distribution in the former case, but not in the latter. When the guessing
probability is .45, the higher guessing level over the entire ability range
for the .6 loading item overcomes the steeper slope of the guessing effect
for the .9 loading item. In other words, the guessing effect is greater for
the poorer item over a wider range of ability.

[Figure 2, panels a through d: probability of a correct response with and
without guessing, the guessing effect, and the ability density function, each
plotted against ability.]



From a practical point of view, these results suggest that guessing at
reasonable levels is a more serious problem for high quality items than low
quality items. In the latter case, the error variance in the item masks the
guessing effects. Of course, this conclusion assumes the correctness of the
"knowledge or random guessing model".

Although the magnitude of the guessing effect has resulted in some
interesting findings, the more important area of interest in this report is
the reliability, or proportion of common variance, as a function of guessing
level. This value can be determined from Equation 23, but first the variance
of the guessing effect and the covariance of the guessing effect and ability
are required. The formulae used to obtain these statistics using numerical
integration are given by

$$ V(G) = \int_{-\infty}^{\infty} \bigl(g - E(g)\bigr)^2 f(g)\,dg $$

and

$$ \mathrm{cov}(\tau, G) = \int_{-\infty}^{\infty} \tau \bigl(g - E(g)\bigr) f(\tau)\,d\tau. $$

The expression in the equation for the covariance is integrated over $\tau$,
since $g$ is a function of $\tau$. It should be recalled that
$\tau \sim N(0, 1)$.
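The variance and covariance integrals can be approximated in the same pass over the ability scale. This is a sketch only: it uses composite Simpson integration in place of the report's IMSL routine, the names are hypothetical, and the guessing effect is computed as $g = \sqrt{1 - a^2}\,(z - z')$ from Equations 8 through 12.

```python
from math import erfc, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def guessing_effect(tau, a, guess, c=0.0):
    """Guessing effect g (Equation 12) at ability tau; assumes 0 <= guess < 1."""
    if guess <= 0.0:
        return 0.0
    sigma = sqrt(1.0 - a * a)
    z = (c - a * tau) / sigma
    p_wrong = 0.5 * erfc(-z / sqrt(2.0))  # Phi(z), the no-guessing P(u = 0 | tau)
    if p_wrong == 0.0:
        return 0.0
    return sigma * (z - STD_NORMAL.inv_cdf((1.0 - guess) * p_wrong))


def guessing_moments(a, guess, c=0.0, lo=-8.0, hi=8.0, n=1600):
    """Return (E(G), V(G), cov(tau, G)) for tau ~ N(0, 1) by Simpson's rule."""
    h = (hi - lo) / n
    sum_g = sum_g2 = sum_tau_g = 0.0
    for i in range(n + 1):
        tau = lo + i * h
        weight = (1 if i in (0, n) else (4 if i % 2 else 2)) * STD_NORMAL.pdf(tau)
        g = guessing_effect(tau, a, guess, c)
        sum_g += weight * g
        sum_g2 += weight * g * g
        sum_tau_g += weight * tau * g
    mean_g = sum_g * h / 3.0
    var_g = sum_g2 * h / 3.0 - mean_g ** 2
    # E(tau) = 0, so cov(tau, G) reduces to E(tau * g).
    cov_tau_g = sum_tau_g * h / 3.0
    return mean_g, var_g, cov_tau_g


mean_g, var_g, cov_tau_g = guessing_moments(a=0.9, guess=0.05)
# Equation 23 with these moments; the report's tables give .08, -.20, and .73.
proportion = (0.81 + var_g + 1.8 * cov_tau_g) / (1.0 + var_g + 1.8 * cov_tau_g)
print(round(var_g, 2), round(cov_tau_g, 2), round(proportion, 2))
```

The integration limits of plus and minus 8 and the 1,600 intervals are arbitrary choices that are generous for a standard normal ability distribution.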


Table 2 gives the variance and covariance values for the same probability
of guessing values and the levels of factor loading used in Table 1. From
this table it can be seen that as the guessing level increases, the variance
increases, and the covariance of guessing and ability becomes more negative.
Also, the same trend can be seen as the factor loading increases.

The negative covariances were expected in these results, since low
ability individuals guess more often than high ability individuals. The
increase in variance was also expected. As the guessing level increases,
the guessing effect function shown in Figures 2a through 2d is shifted
upward, demonstrating a greater range of guessing effect. With increased
factor loading, the guessing effect function increases more sharply,
resulting in the greater magnitude of the variance. It is not surprising
that as the variance increases the covariance also increases in absolute
value.


From the variance of the guessing effect, the covariance of guessing
and ability, as well as the factor loading, the proportion of variance in
item responses accounted for by ability can be determined from Equation 23.
These proportions are presented for the cases used in Tables 1 and 2 in
Table 3. In addition to the 0.0 cutting score case (corresponding to .5
traditional difficulty for the group), the proportions are presented for
the .75 and .25 traditional difficulty cases.

Table 2

Variance of the Guessing Effect and the Covariance of the Guessing Effect
and the Trait Level for c = 0 and Various Combinations of Guessing Level
and Factor Loadings

Guessing                    First Factor Loading
Level      Statistic      .6       .7       .8       .9

 .05       VAR           .00      .01      .03      .08
           COV          -.05     -.07     -.12     -.20
 .15       VAR           .02      .03      .06      .12
           COV          -.11     -.15     -.20     -.28
 .25       VAR           .03      .05      .09       --
           COV          -.15     -.20     -.25       --
 .35       VAR           .04      .07      .11      .18
           COV          -.19     -.24     -.30     -.37
 .45       VAR           .06      .09      .13      .21
           COV          -.23     -.28     -.33     -.40
 .55       VAR           .07      .11      .16      .23
           COV          -.26     -.31     -.36     -.43
 .65       VAR           .09      .13      .18      .25
           COV          -.29     -.34     -.40     -.46
 .75       VAR           .11      .15      .21      .28
           COV          -.32     -.37     -.43     -.49

Note. A dash marks an entry that is illegible in the source.



Table 3

Proportion of Variance Accounted for in Item Responses
by Guessing Level, Factor Loading, and Cutting Score

Cutting   Factor                          Guessing Level
Score     Loading    .00    .05    .15    .25    .35    .45    .55    .65    .75

 0          .9       .81    .73    .69    .66    .64    .61    .59    .56    .53
            .8       .64    .57    .51    .47    .44    .40    .37    .34    .31
            .7       .49    .44    .38    .34    .31    .27    .24    .22    .19
            .6       .36    .32    .28    .24    .21    .18    .16    .14    .12

-.6745      .9       .81    .80    .78    .77    .76    .74    .73    .72    .70
            .8        --    .62    .59    .57    .55    .53    .50    .48    .45
            .7        --    .47    .45    .42    .40    .38    .35    .33    .30
            .6       .36    .35    .32    .30    .28    .26    .24    .22    .20

 .6745      .9       .81    .60    .52    .47    .43    .39    .36    .33    .29
            .8       .64    .45    .34    .30    .26    .22    .19    .17    .14
            .7       .49    .35    .26    .21    .17    .14    .12    .10    .08
            .6       .36    .26    .18    .14    .11    .09    .07    .06    .05

Note. Ability is assumed to be distributed N(0, 1). A dash marks an entry
missing from the source.
[...] substantially [...], depending on the distribution of difficulties of
the items, their guessing levels, and the interitem covariances. Because of
the [...] of determining the reliability of a test [...], only the
reliability for a single item is presented.

Evaluation of Empirical Item Sorting Procedures

Since a theoretical analysis of the effects of guessing on the proportion
of [...] was not possible, the more realistic, and therefore more complex,
cases were studied by applying the item sorting procedures to various
simulated data-sets and real data-sets. The basic design for this component
of the research study was to produce item sets with known structure using
simulated and real test results and then attempt to recover the structure
using each of several available techniques. The techniques considered
included: factor analysis, cluster analysis, nonmetric multidimensional
scaling, and latent trait theory.

Besides the choice of techniques to be used on the item response data,
another decision needed to be made concerning the coefficient used as a
measure of similarity between the items. Factor analysis is rather limited
in this choice, being tied to correlation type statistics. Cluster analysis
and nonmetric multidimensional scaling [...] the possibility of using many
other measures of similarity. Therefore the following coefficients were
applied to the data, and the various techniques were applied to each: phi
coefficient, tetrachoric correlation, corrected tetrachoric correlation,
[...], Kendall's tau B, and the Lijphart index. The formula and reference
for each of these coefficients is given in Appendix A.

Item Sorting Procedures

Each of the techniques used in the analysis of the item response data
has many variations in basic procedure as well as several options as to the
specific method of application. Therefore, before describing the research
design, each technique will be described to make its identity unambiguous.
Factor Analysis. Two basic factor analysis procedures were used on the
data: the method of principal components and the method of principal factors.
These two procedures differ mainly in that the former assumes that all of the
variance influences the magnitude of the correlations, while the latter
assumes that some variance is unique to each item and that a reduced number of
factors (fewer than the number of items) explains the correlations. Although
the latter procedure seems more reasonable, both were used on selected
data-sets to determine their relative value.


In addition to the basic factor analysis procedures, two types of
rotations were used to help in the interpretation of the results. The two
rotations used were VARIMAX and OBLIMIN. These two were selected because of
their general availability, and because they allowed comparisons between
orthogonal and oblique solutions.

The factor analyses were run on only three of the similarity coefficients
mentioned above: the phi coefficient, the tetrachoric correlation, and the
tetrachoric correlation corrected for guessing (Carroll, 1945). The other
coefficients were not used because they did not even approximate the
assumptions of the factor analytic model.


Because of the different factor analytic options available in different
packages, four different packages were used to perform the analyses. These
included SPSS (Nie, Hull, Jenkins, Steinbrenner, and Bent, 1975), SOUPAC
(Computing Services Office, 1974), OSIRIS III (Institute for Social Research,
1974), and SAS (Barr, Goodnight, Sall, and Helwig, 1976). In some cases, the
same analyses were run using two different packages to check their
comparability. Differences in results obtained from the different package
programs were minor.

Cluster Analysis. Two different cluster analysis approaches were taken
for this study. The first, labeled CLUSTER for this report, builds clusters
of items one at a time. The procedure first searches the input similarity
matrix for the two items with the highest similarity. The matrix is then
searched for the item that has the highest minimum similarity to the items
already in the cluster. This item is also added to the cluster. This procedure
continues until no items can be found with a similarity greater than a pre-set
cut-off value. At that point the matrix is again searched for the two items
not included in the first cluster with the highest similarity. These two
items form the beginning of a new cluster. The clustering procedure then
continues until all items are used or until none of the similarities exceed
the cut-off value.
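
The CLUSTER steps described above can be sketched in a few lines of Python.
This is only an illustrative reading of the verbal description, not the OSIRIS
III routine itself; the function name greedy_clusters, the handling of ties,
and the treatment of a similarity exactly equal to the cut-off (rejected) are
assumptions made for the example.

```python
import numpy as np


def greedy_clusters(sim, cutoff):
    """Build clusters one at a time from a symmetric similarity matrix.

    Hypothetical sketch of the CLUSTER description: seed each cluster with the
    most similar unused pair, then keep adding the unused item whose minimum
    similarity to the current cluster members is highest.
    """
    sim = np.asarray(sim, dtype=float)
    unused = set(range(sim.shape[0]))
    clusters = []
    while len(unused) >= 2:
        # Seed: most similar pair among items not yet placed in a cluster.
        items = sorted(unused)
        best_pair, best_sim = None, -np.inf
        for pos, a in enumerate(items):
            for b in items[pos + 1:]:
                if sim[a, b] > best_sim:
                    best_pair, best_sim = (a, b), sim[a, b]
        if best_sim <= cutoff:
            break  # no remaining pair exceeds the cut-off value
        cluster = list(best_pair)
        unused -= set(best_pair)
        # Grow: add the item with the highest minimum similarity to the cluster.
        while unused:
            cand = max(sorted(unused),
                       key=lambda i: min(sim[i, j] for j in cluster))
            if min(sim[cand, j] for j in cluster) <= cutoff:
                break
            cluster.append(cand)
            unused.remove(cand)
        clusters.append(sorted(cluster))
    return clusters
```

Items that never exceed the cut-off with any other item are simply left out of
the returned clusters in this sketch.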

The second clustering procedure used, called HICLUSTER in this report,
is a hierarchical clustering procedure. In this procedure, the most similar
pair of items is connected first, then the next most similar, and so on, to
form initial clusters. These initial clusters are combined when all of the
points in one cluster are connected to all of the points in another cluster.
Clustering in this procedure continues until all of the items are included
in one cluster. All of the similarity coefficients listed above were used
with both of these procedures.

Both of the clustering procedures used for this study were applied using
programs from the OSIRIS III computer program package. Although this package
is not as widely available as SAS or SPSS, the clustering routines from
this package were used because of their greater versatility.

Nonmetric Multidimensional Scaling. The nonmetric multidimensional
scaling procedure used for the data analysis in this study was the basic
MDSCAL procedure developed by Shepard (1962) and Kruskal (1964). This
procedure rank orders the similarity of the items in terms of the specified
similarity coefficient, and then attempts to define a space of minimum
dimensionality such that the distances between the items in the space are
ranked in the same order as the initial similarities. The procedure uses a
steepest descent iterative approach to improve the relationship between the
spatial configuration and the initial similarities. When the rate of
improvement levels off, the solution is accepted. A Euclidean metric was used
for all of the analyses using this procedure.



The OSIRIS III version of MDSCAL was used for all of the analyses
presented in this report. Each of the coefficients mentioned above was used
as a similarity measure for this procedure, since it makes no assumption other
than an ordinal scale concerning the coefficients. Although numerous other
multidimensional scaling algorithms exist, this algorithm was selected because
it is widely available.

Latent Trait Analysis. Although latent trait analysis is not commonly
thought of as a multidimensional clustering technique, some results obtained
in previous research suggested that it might be used as such (Reckase, 1979).
That research suggested that when several factors are present the LOGIST
(Wood, Wingersky, & Lord, 1976) item calibration program selects one factor
as a basis for item calibration. Thus, items with high discrimination
parameter estimates should be from the same latent dimension, while those with
low estimates should be from other dimensions. By deleting the highly
discriminating items after each run of the program, another set of highly
discriminating items may be found that measure a different latent dimension.
Thus, iterative use of the program, with item deletions between successive
iterations, may yield sets of homogeneous items. It was the purpose of the
analyses performed for this research to determine if that were indeed the case.



Data-Sets 






As mentioned earlier, the item sorting procedures were applied to two
kinds of data-sets: simulated and actual test data. The simulated responses
of examinees to items were used so that precise control could be maintained
over the dimensionality of the data. The actual test data were used to get
a more realistic evaluation of the procedures. The production procedures and
characteristics of the data-sets will now be described.
Simulated Data. A total of 24 simulated data-sets were produced for
this study. These data-sets all represented the responses to 50 items by
1000 individuals. They varied in the number of dimensions used to generate
the responses, the distribution of item difficulties, the guessing level,
and the distribution of the guessing level. All of the data-sets were
generated using a variation of the procedure described by Wherry, Naylor,
Wherry, & Fallis (1965).

This procedure generates data using the basic linear factor analytic
model. A more detailed description of the procedure is given in Reckase (1979).


The procedure developed by Wherry et al. generates data to match a
specified factor structure, but does not include a guessing effect. Therefore,
after the simulated responses were produced using the above procedure,
the incorrect responses to an item were randomly changed to correct responses
at a rate equal to the guessing probability for the item. This was done by
comparing a uniform random number on the 0.0 to 1.0 range with the guessing
level for the item, and changing an incorrect response to a correct response
if the selected random number was less than the guessing level.
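
The guessing step just described can be sketched as follows. This is a minimal
illustration, assuming the responses are stored as a persons-by-items array of
0 (incorrect) and 1 (correct); the function name add_guessing and the use of
NumPy's random generator are choices made for the example and are not taken
from the original programs.

```python
import numpy as np


def add_guessing(responses, guessing_levels, rng):
    """Change incorrect responses to correct at the item's guessing rate.

    responses: persons x items array of 0/1 scores (assumed layout).
    guessing_levels: a single level, or one level per item (column).
    rng: a numpy.random.Generator supplying the uniform random numbers.
    """
    responses = np.asarray(responses)
    levels = np.broadcast_to(np.asarray(guessing_levels, dtype=float),
                             responses.shape)
    draws = rng.random(responses.shape)          # uniform on [0.0, 1.0)
    flip = (responses == 0) & (draws < levels)   # only incorrect responses change
    return np.where(flip, 1, responses)
```

For example, add_guessing(data, 0.25, np.random.default_rng(1)) would apply a
constant .25 guessing level to every item, while passing a vector of 50 levels
would give each item its own guessing probability.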

The total list of data-sets produced for the study is presented in
Table 4. As can be seen from the table, more than half of the data-sets
produced used only one generating factor. These data-sets were produced to
determine the effect of guessing on the obtained dimensionality of a set of
test data. Both the level of guessing and the distribution of parameters were
varied for these data-sets.



The next set of data-sets listed in the table used two orthogonal factors
to generate the item responses. This set of relatively simple multidimensional
data-sets was used to determine which procedure could adequately find the
homogeneous item sets within the test. If a procedure was not successful on
this "easy" set of data, it was eliminated from consideration.

The remaining simulated data-sets used three or nine orthogonal factors
to generate the item responses. These data-sets were generated to have a
large first factor to more accurately simulate what was believed to be a
realistic state of nature. Only item sorting procedures that succeeded on the
two-dimensional data were applied to these more complex data-sets.

Real Data. The real data-set used in this study was produced by sampling
items and responses from the results of the 1975-76 administration of the Iowa
Tests of Educational Development (1972). The desire here was to produce a test
with two underlying dimensions that contained all the sources of variation
present in typical test administrations. To achieve the desired
dimensionality, items were selected from the Expression and Quantitative
Thinking subtests of the ITED. These two subtests were judged to be most
dissimilar, and so most likely to yield the desired structure. A total of 50
items were randomly selected from the 105 items in the two subtests using a
stratified random sampling approach. Thirty-three of the items in this
data-set were from the Expression subtest and 17 were from the Quantitative
Thinking subtest.



Table 4
Catalogue of Data-Sets

Dimensionality of Data Set    Labels

1-Factor                      SD150N.CG00, SD150R.CG00, SD150R.CG05
                              SD150R.CG15, SD150R.NG15, SD150N.CG15
                              SD150R.CG25, SD150R.NG25, SD150N.CG25
                              SD150N.CG35, SD150R.CG35, SD150N.CG45
                              SD150R.CG45, SD150N.CG55, SD150R.CG55
                              SD150N.CG65, SD150R.CG65, SD150N.CG75
                              SD150R.CG75

2-Factor                      SD250R.CG00, SD250N.NG20, SD250R.CG25

3-Factor                      SD350N.NG20

9-Factor                      SD950N.NG20

Note. The label of the data-set describes the data-set. The first two letters
stand for Simulation Data. The next three or four digits tell the number
of factors and number of items. All data-sets contained 50 items. The
letter following the 50 tells the distribution of traditional item
difficulties: N or R, meaning normal or rectangular, respectively. Following
the period is CG or NG, standing for constant or normally distributed
guessing. The final two digits give the guessing level. The values given
are the guessing level for CG data-sets or the mean guessing level for NG
data-sets.
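
The labeling scheme in the note to Table 4 is regular enough to be decoded
mechanically. The sketch below is an illustration only: the names DataSetLabel
and parse_label are hypothetical, and the pattern assumes (as the note states)
that every data-set has 50 items, so the digits before the "50" are read as the
number of factors.

```python
import re
from typing import NamedTuple


class DataSetLabel(NamedTuple):
    n_factors: int
    n_items: int
    difficulty_distribution: str
    guessing_distribution: str
    guessing_level: float  # constant level (CG) or mean level (NG)


# SD + factors + "50" items + N/R difficulty + "." + CG/NG + two-digit level
_PATTERN = re.compile(r"^SD(\d{1,2})(50)([NR])\.(CG|NG)(\d{2})$")
_DIFFICULTY = {"N": "normal", "R": "rectangular"}
_GUESSING = {"CG": "constant", "NG": "normally distributed"}


def parse_label(label):
    """Decode a Table 4 style label such as 'SD150N.CG25'."""
    match = _PATTERN.match(label)
    if match is None:
        raise ValueError(f"unrecognized data-set label: {label!r}")
    factors, items, diff, guess, level = match.groups()
    return DataSetLabel(int(factors), int(items), _DIFFICULTY[diff],
                        _GUESSING[guess], int(level) / 100)
```

Under these assumptions, parse_label("SD250N.NG20") describes a two-factor,
50-item data-set with normally distributed difficulties and normally
distributed guessing with a mean level of .20.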
Research Design

The basic research design of this study contains four components. First,
the four techniques of interest (factor analysis, cluster analysis, nonmetric
multidimensional scaling, and latent trait analysis) were applied to the
one-dimensional data-sets, with guessing varied, to attempt to discover the
effect of guessing on the techniques. This was done by plotting various
characteristics of the techniques (e.g., size of first eigenvalue) against the
guessing level to determine if any relationship existed. Also, the structure
of the data-sets was considered unknown and the results of the procedures were
analyzed to determine if the unidimensional structure could be discovered.
This set of analyses formed a basis for comparison for all of the subsequent
analyses.

The second analysis component consisted of applying the four techniques to
the two-, three-, and nine-dimensional data-sets. For each of the data-sets, an
attempt was made to recover the underlying structure of the data. If a
procedure failed for a low dimensional data-set, it was not used with the more
complex data-sets.



The third analysis component consisted of applying the four techniques to
the real data-set. The procedure used with the real data-set was similar to
that used with the simulated data-sets. The techniques were evaluated on their
ability to reproduce what was thought to be the underlying structure of the
data-set. In this case the data-set was constructed to have two components,
but since the true structure could not be determined with certainty the
interpretation of the results was much more cautious.

The final analysis performed for this study was the comparison of the
results obtained using the simulation data with those suggested by the
research literature. That is, the obtained reliability as a function of
guessing was compared to the theoretical predictions.



Results

One-Dimensional Simulated Data

The results of the application of the four techniques to the
one-dimensional simulated data will be presented first. The factor analysis
results will be presented first, followed by the multidimensional scaling,
cluster analysis, and latent trait analysis results.

Factor Analysis. The first analysis performed using the factor analysis
procedure was determination of the relationship between the size of the first
factor on the test and the magnitude of the guessing component contained in
the responses to each item on the test. To obtain this information, a principal
components factor analysis was performed on tetrachoric correlations for eight
data-sets. These data-sets were all generated using a normal distribution of
traditional item difficulty centered around .5. Each was generated using a
constant guessing level. The guessing levels used were 0, .15, .25, .35, .45,
.55, .65, and .75. All data-sets were generated so that each item had a .9
loading on the first factor before the guessing effect was added.

To show the relationship between the guessing level and the size of the
first factor, the proportion of total test variance accounted for by the first
factor was plotted against the size of the guessing component. This plot is
given in Figure 3, along with a plot of the KR-20 reliability against the
guessing level. As can be seen from the plot, the proportion of variance
accounted for by the first factor dropped off substantially with an increase
in guessing. At the .15 guessing level, the proportion of variance had already
declined to .62 from the .83 obtained for the no guessing case. It is
interesting to note that the decline in the KR-20 reliability is not nearly as
dramatic, showing its insensitivity to guessing effects.
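
The two quantities compared in this paragraph can be computed as in the sketch
below. It is illustrative only: the function names are hypothetical, the KR-20
function uses the standard formula with the population variance of total
scores, and the first-factor proportion is taken as the largest eigenvalue of
a correlation matrix divided by its trace. The report's analyses used
tetrachoric correlations; computing those is not shown here, so the matrix is
simply assumed to be supplied.

```python
import numpy as np


def kr20(responses):
    """Kuder-Richardson formula 20 for a persons x items array of 0/1 scores."""
    x = np.asarray(responses, dtype=float)
    k = x.shape[1]
    p = x.mean(axis=0)                     # proportion correct per item
    total_var = x.sum(axis=1).var()        # population variance of total scores
    return (k / (k - 1)) * (1.0 - np.sum(p * (1.0 - p)) / total_var)


def first_factor_proportion(corr):
    """Share of total variance carried by the first principal component."""
    corr = np.asarray(corr, dtype=float)
    eigenvalues = np.linalg.eigvalsh(corr)  # ascending order for symmetric input
    return float(eigenvalues[-1] / np.trace(corr))
```

Plotting both values against the guessing level for a series of data-sets would
give a display of the kind described for Figure 3.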

Along with the analysis of the proportion of variance accounted for by the
first factor, an attempt was also made to determine if guessing induced
additional factors in the test. That is, did the decline in the first factor
indicate the presence of other factors? To determine this, the number of
factors in each factor analysis was determined using the scree technique. The
factor loadings for those factors were then studied to determine whether they
were interpretable. For all cases except the .00 and .75 guessing level
data-sets, two factors seemed to be present in the data. The second factor for
all of the two factor cases looked like a guessing factor, with high loadings
for the difficult items. For higher guessing levels, the second factor was not
as clear, disappearing altogether for the .75 guessing level data-set.

In addition to the creation of a second factor in these data-sets, the
loadings of the items on the first factor were also affected. They were found
to decline with an increase in the difficulty of the test items. Since this
did not occur for the .00 guessing data-set, the effect can be attributed to
guessing.

Since the guessing factors were defined by loadings on the hard items, it
would seem reasonable that the distribution of item difficulties would interact
with the guessing effect. To test this conjecture, a data-set was produced with
a rectangular distribution of difficulty rather than a normal one, as in the
previous data-sets, and a .25 guessing level. The results of the principal
component analysis of this new data-set showed that the presence of items of
more extreme difficulty had the effect of reducing the proportion of variance
accounted for by the first factor from .50 to .41 and increasing the number of
factors in the data-set. The scree technique indicated four factors, but only
three were readily interpreted. The second and third factors for this data-set
both seemed to be guessing factors. For comparison purposes, the factor
loadings for the first two principal components from the normally distributed
data-set, and the first three from the rectangularly distributed data-set, are
presented in Table 5. Notice the guessing factors in the data and the decline
in the first factor loadings with the increased difficulty of the items.
Several other data-sets were produced with rectangularly distributed
difficulties and their analysis produced similar results.
FIGURE 3
PROPORTION OF VARIANCE IN THE FIRST FACTOR AS A FUNCTION OF GUESSING LEVEL
[Plot not reproduced. Horizontal axis: guessing level (0 to 1.00). Vertical
axis: proportion of variance.]



Table 5

Principal Component Factors from Tetrachoric Correlations
for Simulated Tests with Normal and Rectangular Distributions
of Difficulty and .25 Guessing Level

          Factors (Normal Distribution)    Factors (Rect. Distribution)
Item      Difficulty     I      II         Difficulty     I      II     III

  1           15        39      34             01        06      13      31
  2           18        44      37             03        09      23      07
  3           19        55      33             05        17     [?]      36
  4           21        50      48             07        12      29      42
  5           22        59      22             09        20      36      31
  6           27        60      27             11        25      25      19
  7           29        64     [?]             13        30      11      38
  8           29        58      37             15        29      43      05
  9           29        64      24             17        41      23      37
 10           31        66      26             19        39      31      32
 11           33        68      21             21        53      25     [?]
 12           34        68      20             23        49      33      18
 13           34        71      18             25        47      36     -14
 14           37        65      18             27        54      32      14
 15           38       [?]      17             29        55      40     -03
 16           39        74      13             31        63      31     -01
 17           41        68      16             33        55      36     -04
 18           42        73     -01             35        66      24     -15
 19           42        67      13             37        62      32     -05
 20           47       [?]      12             39        66      28     -07
 21           48        73      07             41        63      34     -26
 22           48        73      01             43        68      26      01
 23           52        72      03             45        66      28     -11
 24           52        72     -06             47        74      19     -07
 25           52        72     -01             49        71      20     -19
 26           54        77     [?]             51        72     [?]     -14
 27           54        74     [?]             53        70      08      06
 28           54        73     [?]             55        76      10     -11
 29           55        73      08             57        73      14     -13
 30           55        75     -07             59        75      09     -14
 31           56        78     -04             61        78     -10      17
 32           57        78     -10             63       [?]      06     -21
 33           58        75     -08             65        77     -09     -07
 34           58        74      06             67        79     -08     -09
 35           58        77     -07             69        75     -09     -15
 36           60        76     -04             71        81     -18     -01
 37           60        76     -18             73        77     -08     -07
 38           60        74     -14             75        78     -14     -10
 39           61        74     -10             77        78     -20     -13
 40           61        79     -18             79        78     -25     -05
 41           62        77     -15             81        77     -23     -01
 42           64        77     [?]             83        77     -23     -05
 43           64        78     -21             85        79     -32     -01
 44           65       [?]     [?]             87        79     -35      13
 45           65       [?]     -26             89        77     -03     [?]
 46           66        74     -27             91        77     -36     -04
 47           69        78     -32             93        76     -37      19
 48           70        77     -35             95        79     -49      17
 49           70        75     -35             97        75     -43      02
 50           79        71     -42             99        63     -60      55

Note. All values are presented without decimal points.
[...] the effects of guessing on the unidimensional simulation data. First,
guessing reduced the contribution of the first principal component to the
factor analysis results. Second, the loadings of the items on the first factor
were reduced to the extent that guessing affected the items. Third, extra
factors which seemed to be guessing factors were present in the principal
component solution. Similar results were obtained when the principal factor
procedure was used instead of the principal component solution.



Since the purpose of this report is to find methods for recovering
unidimensional sets of items from a test, one further analysis was run on the
one-factor rectangular distribution of difficulty data-set with .25 guessing.
The purpose of this analysis was to determine if the correlation matrix could
be corrected for guessing. Carroll's (1945) correction for the four-fold tables
used to compute tetrachoric correlations was selected for this purpose. Since
the true guessing level for an item is not usually known, the data were
corrected for guessing using .15, .25, and .35 guessing levels. The corrected
tetrachoric correlation matrices were then factor analyzed using the principal
component technique. The first two factors obtained for the corrected matrices
and the uncorrected solution are shown in Table 6.
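
To illustrate what correcting a four-fold table for guessing involves, the
sketch below inverts the random-guessing model used to build the simulated
data (an incorrect response becomes correct with probability equal to the
item's guessing level, independently for the two items). It is a derivation
under that assumption and is not claimed to reproduce Carroll's (1945)
formulas exactly; the function names and the cell ordering are hypothetical.

```python
def apply_guessing(t11, t10, t01, t00, g1, g2):
    """Observed four-fold proportions after guessing is added to true ones.

    Cell t10 means item 1 correct and item 2 incorrect, and so on.
    """
    o00 = t00 * (1 - g1) * (1 - g2)
    o10 = (1 - g2) * (t10 + g1 * t00)
    o01 = (1 - g1) * (t01 + g2 * t00)
    o11 = 1.0 - o00 - o10 - o01
    return o11, o10, o01, o00


def correct_fourfold(o11, o10, o01, o00, g1, g2):
    """Recover the no-guessing proportions from the observed proportions."""
    t00 = o00 / ((1 - g1) * (1 - g2))
    t10 = o10 / (1 - g2) - g1 * t00
    t01 = o01 / (1 - g1) - g2 * t00
    t11 = 1.0 - t00 - t10 - t01
    return t11, t10, t01, t00
```

When the assumed guessing levels are larger than the true ones, the recovered
cells can become negative, which is one way an overcorrected table can produce
a degenerate correlation.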

The most obvious result that can be seen in Table 6 is that overcorrecting
for guessing (.35 correction) results in a very unusual factor analysis
solution. The first seven items defined unique factors, and many of the factor
loadings were essentially 1.0. Overcorrecting for guessing clearly does serious
harm to the factor analysis, resulting in meaningless results.

Correcting for guessing at the .15 and .25 levels gave more reasonable
results. The first factor loadings were increased above the uncorrected values.
In many cases the .25 correction yielded loadings close to the .9 values used
to generate the data. The .15 correction did little to remove the second factor
from the solution. The .25 correction did tend to restrict the influence of the
second factor to fewer items, mainly the most difficult items in the data-set.

In general, these results indicate that the correction for guessing for the
tetrachoric correlations has some merit, but care must be taken not to
overcorrect. The first factor loadings are improved by the procedure, but the
correction did not totally remove the second factor, which was attributed to
guessing.

Nonmetric Multidimensional Scaling. The first analysis performed using the
nonmetric multidimensional scaling technique was the application of the MDSCAL
program to the one factor data with a rectangular distribution of item
difficulty and no guessing. This analysis was performed on the data using each
of thirteen similarity coefficients. These included the following: agreement
coefficient, koppa coefficient, kappa coefficient, Lijphart index, Kendall's
tau B, approval score, phi coefficient, Yule's Q, Yule's Y, phi over phi max,
gamma, tetrachoric correlation, and eta coefficient. Each of these coefficients
is described in Appendix A. This large set of coefficients was used since
MDSCAL does not require any special characteristics in a measure of similarity,
and it was hoped that one of these coefficients would be less sensitive to
guessing effects than the others.
Table 6

Factor Pattern Matrices for a Two-Factor
Principal Component Solution of One-Factor Data with .25 Guessing
with Various Levels of Correction for Guessing

         No Correction    .15 Correction    .25 Correction    .35 Correction
Item       I      II        I      II         I      II         I      II

  1       06      13       14      24        64      75        00      00
  2       09      23       17      42        78      92        00      00
  3       17      15      [?]      18        99      48        00      00
  4       12      29       24      51        66      63        00      00
  5       20      36       32      59        71      69        00      00
  6       25      25       38      36        75      30        00      00
  7       30      11       49      34        91      14        00      00
  8       29      43       41      56        64      51        97     -61
  9       41      23       60      22        90      09       101      07
 10       39      31       53      37        11      33       100     -03
 11       53      25       75      15       100     -06       101      07
 12       49      33       67      30        95     -06       101      07
 13       47      36       60      34        79     -01        99     -48
 14       54      32       71      28        95      12       101      07
 15       55      40       70      40        94      10       101      07
 16       63      31       79      26       100      00       101      07
 17       55      36       66      35        80      17        99     -29
 18       66      24       81      10       [?]     [?]       101      08
 19      [?]      32       74      26        87      19       100      03
 20       66      28       78      23        94      13       101      07
 21       63      34       73      28       [?]      13        91     -39
 22       68      26       78      24        93      20       101      04
 23       66     [?]       75      25        86      19        98     -04
 24       74      19       86      09        97      03       101     -01
 25       71      20       80      13        93      08        99     -23
 26       72      18       82      06       [?]     [?]       [?]     [?]
 27       70      08       79      01       [?]     [?]       [?]     [?]
 28       76      10      [?]      03       [?]     [?]       [?]     [?]
 29       73      14       80      09       [?]     [?]       [?]     [?]
 30       75      09      [?]      03       [?]     [?]       [?]     [?]
 31       78     -10       85     -12       [?]     [?]       [?]     [?]
 32       72      06       78      00       [?]     [?]       [?]     [?]
 33       77     -09       83     -18       [?]     [?]       [?]      07
 34       79     -08       84     -14        91     -17       [?]      11
 35       75     -09       80     -20       [?]     [?]       [?]     [?]
 36       81     -18       87     -25       [?]     [?]       [?]     [?]
 37       77     -08       82     -07       [?]     [?]       [?]     [?]
 38       78     -14       82     -17       [?]     [?]       [?]     [?]
 39       78     -20       82     -24       [?]     -17       [?]     -11
 40       78     -25       82     -31        84     [?]       [?]     [?]
 41       77     -23      [?]     -25       [?]     [?]       [?]     [?]
 42       77     -23       81     -25        89     -23       [?]     [?]
 43       79     -32       83     -34       [?]     [?]       [?]     [?]
 44       79     -35       84     -31        93     -14        98      05
 45       77     -39       80     -35        86     -24        97      15
 46       77     -36       80     -37        88     -29        98     -04
 47       76     -37       82     -28        94     -03        97      04
 48       79     -49       83     -46        95     -43        99      06
 49       75     -43       84     -41        85     -71        98      06
 50       63     -60       81     -37        72     -52        80     [?]

Note. All values are presented without decimal points.
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^ After the MDSCAL analysis was completed, the resulting two-dimensional 
configurations were plotted and the stress of the solutions were noted. Stress 
IS a measure of the deviation of the obtained distances between .the items in 
the MOSCAC solution from the distances present in the initial data. The value 
is standardized by the'squared deviation of all Of thajdistances from the mean 
distance. The smaller the stress, the better the fit bf the MDSCAL solution.. 
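
The stress index described above can be written as a short function. The
sketch below assumes Kruskal's second stress formula, which matches the
normalization by squared deviations from the mean distance. The function name
and the sample numbers are illustrative and are not taken from the MDSCAL
program.

```python
import math

def stress_formula_two(distances, disparities):
    """Stress normalized by squared deviations of the distances from their mean.

    distances   -- distances between items in the fitted configuration
    disparities -- target values derived from the input similarities
                   (assumed here to be supplied by a monotone regression step)
    """
    if len(distances) != len(disparities):
        raise ValueError("distances and disparities must have the same length")
    mean_distance = sum(distances) / len(distances)
    misfit = sum((d - t) ** 2 for d, t in zip(distances, disparities))
    spread = sum((d - mean_distance) ** 2 for d in distances)
    if spread == 0:
        raise ValueError("all distances are equal; stress is undefined")
    return math.sqrt(misfit / spread)

perfect_fit = stress_formula_two([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
poor_fit = stress_formula_two([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
```

A configuration that reproduces its targets exactly has a stress of zero, and
the value grows as the fitted distances depart from them.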

The results of the analysis of the coefficients applied to the one-factor
data indicated that three different types of solutions were being obtained for
the data. Six of the coefficients (agreement, koppa, kappa, Lijphart, tau B,
and phi) yielded plots that placed the items along a straight line on one
dimension, with the items ordered in difficulty: the easy items at one end and
the difficult items at the other. The values of the stress index varied from
.048 to .029, with the kappa coefficient giving the smallest value. The reason
for this pattern is that these coefficients are all affected by the difficulty
of the test items, with items close together in difficulty being judged more
similar. A plot of the MDSCAL result for the kappa coefficient is given in
Figure 4.
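
The dependence of phi on item difficulty can be illustrated with a small
computation. The sketch below uses the standard formula for phi from
proportions correct; the two item pairs are hypothetical and are constructed
so that the responses are as consistent as the difficulties allow.

```python
import math

def phi(p_i, p_j, p_both):
    """Phi coefficient for two dichotomous items.

    p_i, p_j -- proportions answering each item correctly
    p_both   -- proportion answering both items correctly
    """
    return (p_both - p_i * p_j) / math.sqrt(p_i * (1 - p_i) * p_j * (1 - p_j))

def phi_max(p_i, p_j):
    """Largest phi attainable for the given pair of difficulties."""
    return phi(p_i, p_j, min(p_i, p_j))

# Perfectly consistent responses, items of equal difficulty.
same_difficulty = phi(0.5, 0.5, 0.5)
# Perfectly consistent responses, one easy item and one hard item.
different_difficulty = phi(0.8, 0.2, 0.2)
# Dividing by the maximum removes the difficulty effect in this example.
rescaled = different_difficulty / phi_max(0.8, 0.2)
```

With equal difficulties phi reaches 1.0, while the easy and hard pair yields
only .25 even though the responses are equally consistent. This is why items
close together in difficulty are judged more similar by such coefficients.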

The second type of solution was obtained for six of the coefficients
(Yule's Q, Yule's Y, phi/phi max, gamma, tetrachoric, and eta). This solution
resulted in a circular cluster of points. The position of the items within
the cluster seemed to have no obvious relationship to the difficulty of the
items. The stress value for the solutions ranged from .34 to .33, with Yule's
Q and gamma giving the smallest values. This solution is a result of the fact
that these coefficients are not affected by the difficulty of the item, and
therefore all pairs of items are found to be equally similar for one-dimensional
data. The circular pattern is a result of trying to get all of the items equal
distances apart in a two-dimensional space. Of course this cannot be done, so
the stress of the solutions is higher than that for the first set of
coefficients. An example of the circular solution for the gamma coefficient is
given in Figure 5.
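
The insensitivity of these coefficients to item difficulty can be checked
directly for Yule's Q and Yule's Y. The sketch below uses the standard
fourfold-table formulas; the cell counts are hypothetical.

```python
import math

def yules_q(a, b, c, d):
    """Yule's Q from a fourfold table.

    a = both items right, b = first right and second wrong,
    c = first wrong and second right, d = both items wrong.
    """
    return (a * d - b * c) / (a * d + b * c)

def yules_y(a, b, c, d):
    """Yule's Y (coefficient of colligation) from the same table."""
    root_ad = math.sqrt(a * d)
    root_bc = math.sqrt(b * c)
    return (root_ad - root_bc) / (root_ad + root_bc)

balanced = (40, 10, 10, 40)
# Doubling the first row makes the first item easier without changing
# the odds ratio of the table.
first_item_easier = (80, 20, 10, 40)

q_balanced = yules_q(*balanced)
q_easier = yules_q(*first_item_easier)
y_balanced = yules_y(*balanced)
y_easier = yules_y(*first_item_easier)
```

Both coefficients depend only on the cross-product ratio of the table, so
changing the difficulty of one item in this way leaves them unchanged.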

The third type of solution obtained from the MDSCAL procedure resulted from
the application of the approval statistic. This solution had all of the easy
items clustered tightly in the center, with the hard items spread out to one
side. The pattern is a result of the way this statistic is computed. It is
simply the proportion of times both items are answered correctly at the same
time. Thus, easy items are found to be more similar than hard items, which
have fewer correct responses. This solution had the lowest stress of all of
the procedures, with a value of .021. The plot of this solution is given in
Figure 6.
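
The approval statistic as defined above takes only a few lines to compute.
In the sketch below the function name and the response vectors are
hypothetical; responses are coded 1 for correct and 0 for incorrect.

```python
def approval(item_i, item_j):
    """Proportion of examinees who answer both items correctly."""
    both_correct = sum(1 for x, y in zip(item_i, item_j) if x == 1 and y == 1)
    return both_correct / len(item_i)

# Two easy items (seven of eight examinees correct on each).
easy_a = [1, 1, 1, 1, 1, 1, 1, 0]
easy_b = [1, 1, 1, 1, 1, 1, 0, 1]
# Two hard items (two of eight examinees correct on each).
hard_a = [1, 1, 0, 0, 0, 0, 0, 0]
hard_b = [1, 0, 1, 0, 0, 0, 0, 0]

easy_pair = approval(easy_a, easy_b)
hard_pair = approval(hard_a, hard_b)
```

The easy pair receives a similarity of .75 and the hard pair only .125, which
is the difficulty effect that pulls the easy items together in the plot.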

The next analysis run using the nonmetric multidimensional scaling technique
was the computation of the two-dimensional solution using each coefficient for
the one-factor, rectangular distribution of difficulty data-set with a .25
guessing level. The purpose of this analysis was to determine which coefficient
gave a solution that was least affected by guessing.

FIGURE 4

TWO-DIMENSIONAL MDSCAL SOLUTION
FOR THE ONE-FACTOR NO GUESSING DATA
USING THE KAPPA COEFFICIENT

FIGURE 5

TWO-DIMENSIONAL MDSCAL SOLUTION
FOR THE ONE-FACTOR NO GUESSING DATA
USING THE GAMMA COEFFICIENT

FIGURE 6

TWO-DIMENSIONAL MDSCAL SOLUTION
FOR THE ONE-FACTOR NO GUESSING DATA
USING THE APPROVAL COEFFICIENT

The coefficients that yielded linear plots for the no guessing case gave
two different types of plots for the .25 guessing case. The agreement, koppa,
and Lijphart coefficients resulted in wedge-shaped plots for the two-dimensional
MDSCAL solutions, with the items high in guessing being in the wide part of the
wedge. The stress was identical for all three coefficients, with a value of .104.
This was substantially higher than the .042 achieved for the no guessing case.
Figure 7 shows the plot of the results for the agreement score. Since these three
coefficients gave identical results, only the agreement score will be given further
consideration.

The second type of plot obtained from the coefficients giving a linear
plot for the no guessing data was a crescent shape with the easiest and hardest
items at the points of the crescent. The Kendall's tau B, phi, and kappa
coefficients gave this type of pattern. The stress for these solutions ranged
from .137 to .158, up from .029 to .048 when no guessing was present. Of these
three coefficients, kappa resulted in the MDSCAL solution with the smallest
stress value. The plot of the two-dimensional solution for the kappa coefficient
is given in Figure 8. The effect of guessing on these coefficients seems to be
an increased similarity in the very easy and very hard items, resulting in the
curvature in the plots.

The coefficients that resulted in circular patterns for the one-dimensional
data with no guessing also yielded two patterns when MDSCAL was applied to the
one-dimensional data with .25 guessing. The Yule's Q, Yule's Y, phi/phi max,
gamma, and tetrachoric coefficients all resulted in two-dimensional solutions
that showed the original circular patterns distorted by pulling the items most
affected by guessing down to the lower left. Guessing increased the distance
between the items most affected by guessing, causing greater dispersion for
those items. The stress values for the solutions ranged from .219 to .270, with
the tetrachoric correlation giving the smallest value. Phi/phi max gave the
largest stress value and showed the greatest dispersion for the easy items. The
distortions caused by guessing brought about a reduction in the stress value
from the .33 value obtained when no guessing was present. It seems that the
guessing effect brings about a more linear continuum than was present previously,
making the data easier to fit. The plot of the two-dimensional MDSCAL solution
for the tetrachoric correlations is presented in Figure 9.



The second type of solution obtained from the set of coefficients that
resulted in circular plots was obtained for the eta coefficient. In this case
the plot remained circular, but the hard items migrated to the circumference of
the two-dimensional structure, while the easy items moved to the center. The
stress for this solution increased from .330 for the no guessing solution to
.355 for the guessing solution. Figure 10 presents the plot of this solution.

The approval score, the coefficient that gave the third type of pattern
for the no guessing data, resulted in a pattern similar to that obtained for
the eta coefficient when guessing was present. A circular pattern resulted,
with the hard items at the circumference and the easy items at the center.
The center cluster was much tighter in this case, however. The stress of this
solution was much higher than the solution for the no guessing data, with a
value of .239, compared to the .021 obtained earlier. A plot of the results
for the approval score is presented in Figure 11.

FIGURE 7

TWO-DIMENSIONAL MDSCAL SOLUTION
FOR THE ONE-FACTOR .25 GUESSING DATA
USING THE APPROVAL COEFFICIENT

One other coefficient was considered for use with the MDSCAL procedure.
That coefficient was the tetrachoric correlation corrected for guessing. To
check its usefulness, the tetrachoric correlations determined from the
one-dimensional, .25 guessing data were corrected for guessing using .15, .25,
and .35 guessing levels. The resulting coefficients were then analyzed using the
MDSCAL procedure. The results for the .15 correction gave a pattern similar
to the uncorrected data, but with slightly higher stress (.234 vs. .219). The
.25 correction resulted in a circular pattern similar to the no guessing data,
but with hard items at one side of the plot of the solution. The stress was
.320, almost as high as for the no guessing data (.334).

The .35 correction resulted in a solution with the seven hardest items
in one group and all of the rest in another. The stress for this solution
was a very low .074. This solution was similar to the no guessing solution
for the approval score.

From the analysis of the one-dimensional, .25 guessing data, four different
patterns of effects can be seen as a result of guessing. The coefficients that
gave linear patterns when no guessing was present were either broadened into
a wedge (agreement score) or bent into a crescent (kappa coefficient). The
coefficients that gave circular patterns when no guessing was present were either
stretched to one side by guessing (tetrachoric correlation) or maintained a
circular pattern, but with the hard items on the outside and easy items in the
middle (eta coefficient). Carroll's correction for guessing did tend to
compensate for guessing effects. However, the MDSCAL solution only matched the
no guessing solution if the correction matched the true guessing level. Otherwise,
the solution was distorted.
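
The sensitivity of the correction to the assumed guessing level can be
illustrated with a fourfold table. The formula used for Carroll's correction
is not given in this section, so the sketch below assumes the usual
knowledge-or-random-guessing model, in which an examinee who does not know an
item answers it correctly with probability equal to the guessing level. The
function names and the example proportions are hypothetical.

```python
def apply_guessing(t11, t10, t01, t00, c_i, c_j):
    """Observed fourfold proportions implied by true knowledge proportions.

    t11 = knows both items, t10 = knows only item i, t01 = knows only item j,
    t00 = knows neither; c_i and c_j are the guessing levels of the two items.
    """
    p00 = (1 - c_i) * (1 - c_j) * t00
    p10 = (1 - c_j) * (t10 + c_i * t00)
    p01 = (1 - c_i) * (t01 + c_j * t00)
    p11 = 1.0 - p00 - p10 - p01
    return p11, p10, p01, p00

def correct_for_guessing(p11, p10, p01, p00, c_i, c_j):
    """Invert the model above for assumed guessing levels c_i and c_j.

    p11 is implied by the other three proportions and is kept only so that
    the argument list mirrors the fourfold table.
    """
    t00 = p00 / ((1 - c_i) * (1 - c_j))
    t10 = p10 / (1 - c_j) - c_i * t00
    t01 = p01 / (1 - c_i) - c_j * t00
    t11 = 1.0 - t00 - t10 - t01
    return t11, t10, t01, t00

true_table = (0.4, 0.1, 0.1, 0.4)
observed = apply_guessing(*true_table, 0.25, 0.25)
matched = correct_for_guessing(*observed, 0.25, 0.25)
overcorrected = correct_for_guessing(*observed, 0.35, 0.35)
```

When the assumed level equals the level used to generate the data, the true
table is recovered exactly. With the .35 correction applied to .25 guessing
data, the corrected table no longer matches the true one, which is consistent
with the distortion noted above.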

Cluster Analysis   As with the factor analysis and multidimensional
scaling procedures, the first analysis performed with the two cluster analysis
procedures was the application of the techniques to one-dimensional data with
no guessing. After the no guessing analysis, the techniques were applied to
the one-dimensional data set with .25 guessing to determine guessing effects.
In all cases data-sets with rectangularly distributed traditional difficulty
indices were used to make clearer any item difficulty effects. All of the
coefficients listed previously were used for these analyses, along with both
cluster analysis procedures, CLUSTER and HICLUSTER. The CLUSTER results will
be presented first.

The CLUSTER results are difficult to interpret because the number of
clusters obtained depends on the cutoff value used to accept an item into a
cluster. Slight changes in the value result in substantial changes in the
number of clusters obtained. Despite these difficulties, a pattern was
determined in the results. The cluster analysis solutions determined for the
kappa, phi, agreement, Lijphart, koppa, tau B, and approval coefficients were
all related to the difficulty of the items. That is, items of similar difficulty
were clustered together. In contrast, the solutions based on Yule's Q,
Yule's Y, eta, tetrachoric, phi/phi max, and gamma tended to form
clusters unrelated to item difficulty. This result is consistent with the fact
that the latter coefficients are all fairly independent of item difficulty,
while the former set of coefficients is not. Individual results will not be
presented for the CLUSTER procedure since they are too dependent on the cutoff
value and since no procedure is known to decide on the number of clusters.
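
The dependence of the number of clusters on the cutoff value can be shown
with a toy example. The sketch below is not the CLUSTER program, whose
algorithm is not described in this section. It assumes a simple rule in which
two items join the same cluster whenever their similarity meets the cutoff,
and the similarity matrix is hypothetical.

```python
def count_clusters(similarity, cutoff):
    """Number of groups formed when items with similarity >= cutoff are linked."""
    n = len(similarity)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the representative of the group containing i.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if similarity[i][j] >= cutoff:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(n)})

similarity = [
    [1.00, 0.62, 0.58, 0.20],
    [0.62, 1.00, 0.55, 0.25],
    [0.58, 0.55, 1.00, 0.30],
    [0.20, 0.25, 0.30, 1.00],
]

clusters_by_cutoff = {c: count_clusters(similarity, c) for c in (0.60, 0.50, 0.28)}
```

For this matrix the three cutoffs give three, two, and one cluster
respectively, so a modest change in the cutoff changes the apparent structure.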

The HICLUSTER procedure gave somewhat similar results to those of the
CLUSTER procedure. The hierarchical solutions developed for the gamma, phi/phi
max, tetrachoric, eta, Yule's Q, and Yule's Y coefficients had no discernible
relationship to item difficulty, while the solutions for the Lijphart, phi,
koppa, tau B, agreement, kappa, and approval coefficients were related to item
difficulty. Within the latter group of coefficients, three distinct patterns
of entry of the items into the clusters were noted. When using the Lijphart,
koppa, and agreement coefficients the clustering procedure initially clustered
the items of extreme difficulty and then worked in toward the more moderate
items. The solutions based on the phi, tau B, and kappa coefficients initially
clustered the items of moderate difficulty and then worked out toward the
extremes. The approval coefficient clustered the easiest items first and then
worked toward the most difficult. The different patterns of results reflect
differences in the effect of item difficulty on the values of the coefficients.
Some have the highest values for middle difficulty items, while others have the
highest value for items at the extremes of the difficulty range. As with the
CLUSTER procedure, no procedure was known for determining the appropriate
number of clusters, so no individual results will be presented here.

The application of the CLUSTER procedure to the one-dimensional, .25
guessing data gave somewhat predictable results. For the kappa, phi, agreement,
Lijphart, koppa, and tau B coefficients, the clusters formed still had a
tendency to be related to the item difficulty, but the relationship was not as
strong. More clusters were formed than when no guessing was present.
This is a result of the reduced magnitude of the coefficients as a result of
guessing. The results for the Yule's Q, Yule's Y, eta, tetrachoric, phi/phi
max, and gamma coefficients changed somewhat from the no guessing case. The
clusters formed for the guessing data had some relationship to the difficulty
of the items, where none was present when guessing was not present. Correcting
the tetrachoric correlations for guessing at any level did not remove this effect.
The results for the approval score were very similar to those presented for the
no guessing data: the easy items formed a large cluster, while many small
clusters were formed from the more difficult items.

The HICLUSTER procedure gave quite different results. The majority of
the coefficients formed a hierarchical structure by grouping the easier items
first, and then working down toward the hard items. The coefficients that
presented this pattern were the gamma, tetrachoric, Lijphart, koppa, agreement,
kappa, and approval coefficients. The Yule's Q, Yule's Y, phi, and tau B
coefficients showed some of this effect, but the results were not as strong. The
phi/phi max and eta coefficients clustered the items in essentially the same
way as when guessing was not present. When the tetrachoric correlation was
corrected for guessing at any level, the effects of item difficulty on the
clustering were removed.

The analysis of the CLUSTER and HICLUSTER results indicates that different
types of clusters are developed depending on the type of coefficient used.
Some coefficients yield clusters related to item difficulty, while others do
not. Guessing tends to force a relationship with item difficulty for both
techniques for most of the coefficients. This will have to be taken into
account when working with multidimensional data.



Latent Trait Analysis   The analysis of the one-dimensional, no-guessing
data with the LOGIST program gave exactly the results that were expected. The
three-parameter logistic a-parameter estimates were all uniformly high around
a value of 2.0. The b-parameter estimates were evenly spaced in the range from
+3 to -3, and the c-parameters were all estimated as 0.0. These results were
obtained by running the LOGIST program with the default program control values.



Similar results to those obtained for the no-guessing data were also obtained
when the simulated data contained a .05, .15, or .25 guessing level, assuming
multiple choice items with 4 responses. The a- and b-parameter estimates gave
results similar to those described above, and the c-parameters were accurately
estimated at the value used to generate the data. When the level of guessing
used to generate the item data was above .25, however, the default options in
the program were unable to accurately estimate the parameters. With guessing
at the .35 level, the c-parameters were underestimated for all but the hard
items. The a-parameter estimates tended to be low for the moderate and hard
items, and the b-parameter estimates were becoming more erratic. The parameter
estimates for the .25 and .35 cases are presented in Table 7. The parameter
estimates are progressively worse for guessing at the .45, .55, .65, and .75
levels.

The parameter estimates obtained from the LOGIST program for the high
guessing levels could be improved by releasing the constraints on the
c-parameter. When the range of acceptable c-values was made larger the program
did a good job of estimating the parameters at the .35 and .45 levels. Parameter
estimates for higher guessing levels were still inaccurate.

In evaluating the results of these analyses, it is clear that the LOGIST
program does well when the guessing levels are low to moderate, and poorly
when guessing is high. These results should be taken as very favorable overall,
since it is unlikely that guessing on typical multiple choice items is ever as
high as .65 or .75. That is, subjects with ability at -∞ are unlikely to have
that high a probability of obtaining a correct response to an item. When the
guessing level is reasonable, the program does a very accurate job of estimating
the parameters.
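
The role of the a-, b-, and c-parameters discussed above can be seen in the
three-parameter logistic item response function. The sketch below uses the
usual form of the model. The scaling constant of 1.7 is an assumption and is
not stated in this section, and the parameter values are illustrative.

```python
import math

def three_parameter_logistic(theta, a, b, c, scale=1.7):
    """Probability of a correct response under the three-parameter logistic model.

    theta -- ability of the examinee
    a     -- item discrimination
    b     -- item difficulty
    c     -- lower asymptote (the guessing level)
    scale -- scaling constant (assumed to be 1.7 here)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-scale * a * (theta - b)))

# An item with high discrimination, middle difficulty, and .25 guessing.
at_difficulty = three_parameter_logistic(0.0, 2.0, 0.0, 0.25)
very_low_ability = three_parameter_logistic(-50.0, 2.0, 0.0, 0.25)
```

At an ability equal to the item difficulty the probability is halfway between
c and 1.0, and for very low ability it approaches c. This is the sense in
which the c-parameter is the probability of a correct response for subjects
with ability at -∞.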
Table 7

Item Parameter Estimates for the One-Dimensional
with .25 and .35 Guessing Levels
and Rectangular Distribution of Difficulty

Summary   The purpose of this section has been to report the results
of applying the four techniques considered in this report (factor analysis,
nonmetric multidimensional scaling, cluster analysis, and latent trait analysis)
to one-dimensional data to serve as a frame of reference for the analysis of
multidimensional data. The factor analysis, multidimensional scaling, and
latent trait analysis gave a clear indication of the one-dimensional nature
of the data when no guessing was present. When guessing was present the
distorting effect could be seen in the results of each of the techniques. The
percent of variance in the first factor was reduced for the factor analysis
technique, along with reduced first factor loadings and the presence of extra
guessing factors. The two-dimensional representations of the MDSCAL results
were stretched or bent by the guessing effect, and LOGIST parameter estimates
were less accurate when high guessing was present (.35 and above).

The results of the two cluster analysis procedures were harder to interpret
in that it was hard to decide how many clusters were in the data. One consistent
finding was that guessing was found to make the solutions more dependent on item
difficulty. The problem with the determination of the number of clusters seems
to make this technique less useful for forming unidimensional subsets.

Each of the above techniques was applied to two-dimensional data to
determine how well the items could be sorted into unidimensional sets. Only
techniques judged to perform this sorting task well were used in later analyses.

Two-Dimensional Simulated Data 

The results of the application of the four techniques to the
two-dimensional data will be presented in the same order as in the previous
section: factor analysis, multidimensional scaling, cluster analysis, and
latent trait analysis. Three two-dimensional simulated data-sets were subjected
to analysis: (a) a data-set with a rectangular distribution of difficulty and
no guessing; (b) a data-set with a normal distribution of difficulty and
normally distributed guessing around .20; and (c) a data-set with rectangularly
distributed item difficulty and constant guessing at .25. These three data-sets
were selected to vary the difficulty of the sorting task and the realistic
nature of the data. All data-sets had 50 items, 1000 cases, and loadings for
each item of .90 on one factor and .00 on the other. The factor loading matrix
used to generate the data is given in Table 8.

Factor Analysis   For each of the data-sets, six factor analyses were
possible. These included the analyses using either the principal component
or principal factor method on phi, tetrachoric, or corrected tetrachoric
correlations. In some cases, maximum likelihood factor analysis was also run
on the data.

The simplest of the three data-sets containing two factors had a
rectangular distribution of difficulties and no guessing effect. Of the six
possible analyses, the principal factor analysis on phi coefficients gave the
best overall results. From this analysis, it was easy to identify the items
generated from each factor, and the eigenvalues indicated two major factors
and two minor were present in the data. The factor loadings resulting from
this analysis are shown in Table 9. All of the other analyses performed on
this data-set yielded factor loading matrices that did not clearly identify
the items in each factor, or that indicated that too many factors were present.

Table 8

Factor Loadings Used to Generate
the Two-Factor and Three-Factor Simulated Data

Note. All factor loadings are presented without decimal points.

The analysis of the data-set with rectangularly distributed difficulty
and guessing set at .25 gave quite a different result. The principal component
analysis of the tetrachoric correlations corrected for guessing at the .25 level
gave the most accurate classification of items into the factors, but also yielded
a solution with 12 eigenvalues greater than 1.0. The first two eigenvalues were
clearly larger than the rest, however. Unfortunately, the results cannot usually
be expected to be as good. A problem with this procedure is that the level of
guessing on the test items is seldom accurately known. The principal factor
approach did almost as well in correctly indicating the factor used to generate
the items and showed many fewer factors present in the data (four eigenvalues
greater than 1). Therefore, the principal factor approach with phi coefficients
was considered the best procedure for use with this data. The factor loading
matrix for the first two factors of the solution is also given in Table 9. Note
the reduction in the magnitude of the factor loadings with increased guessing
and with the extremity of the proportion correct on the items.

The two data-sets described above are not very realistic because tests
seldom have rectangular distributions of difficulty or constant guessing.
Therefore, a two-dimensional data-set with normally distributed item
difficulties and normally distributed guessing levels was also analyzed. The
results of the analysis of these data were uniformly good for all of the
techniques. All techniques gave information that allowed the items on each
factor to be clearly identified. The only difference appeared in the number of
factors indicated in the data. The principal factor analysis of phi coefficients
was the only technique that accurately indicated that two factors were present.
This fact, and the good showing for the other data-sets, seems to indicate
that it is the technique of choice for the two-dimensional data. The results
for this technique for the normally distributed data-set are also given in
Table 9. Note that for all of the phi coefficient analyses reported the
loadings are much lower than those used to generate the data. The results of the
analysis of the tetrachoric correlations more closely approximate the magnitude
of the loadings used to generate the data, but they could not be used to classify
the items as accurately.

Nonmetric Multidimensional Scaling   The nonmetric multidimensional
scaling procedure was applied to the same three two-dimensional data-sets
used with the factor analysis procedure: two dimensions, rectangular
difficulty, no guessing; two dimensions, rectangular difficulty, constant .25
guessing; and two dimensions, normal difficulty, normal .20 guessing. The
results of the analysis of these data-sets will be reported in the order given
above.

Table 9

Factor Loading Matrices from the Analysis
of Three Two-Dimensional Data-sets

Data-set / Technique

Note. All factor loadings are presented without decimal points.

The MDSCAL program was run on the two-dimensional, rectangular difficulty,
no guessing (SD250R.CG00) data-set using 11 similarity coefficients. The koppa
and Lijphart coefficients were deleted since they give identical results to the
agreement coefficient for dichotomous data. Of the remaining coefficients,
those that gave a linear pattern previously gave two solutions for the
two-dimensional data. The agreement coefficient yielded an oval shaped solution
(see Figure 12) and the kappa, phi, and tau B coefficients resulted in
configurations with the items distributed along two roughly parallel
lines (see Figure 13). Since the agreement coefficient based solution could not
be used to separate the items into unidimensional sets it was dropped from
further consideration. The other three "linear" coefficients could be used
equally well to separate items into homogeneous sets, although the phi and tau B
coefficients [illegible].

Of the six coefficients that gave a circular solution for the one-factor
data, five gave a solution for the two-factor, no guessing data that sorted the
items into two distinct, tight clusters. The five coefficients were: gamma,
phi over phi max, Yule's Q, Yule's Y, and tetrachoric. The results for Yule's Y
are presented in Figure 14. Any of these five coefficients could be used to sort
the items into homogeneous sets.

The sixth "circular" coefficient was the eta coefficient. The solution 
obtained using this coefficient could also be used to sort the items into homo- 
geneous sets, but the resulting plot had more spread and had a higher stress 
value than the previous coefficients (.200 vs. .096 and .104). Figure 15 shows 
a plot of this solution. 

The remaining coefficient applied to this data set was the approval
score. Figure 16 shows a plot of the MDSCAL solution using this similarity
coefficient. It gave a butterfly shaped pattern with the easiest items in
the middle. Because of the closeness of the points representing the easy
items from different factors in this solution, it may not yield a result that
is useful for sorting items into homogeneous sets. The stress value for the
solution was .117.

When guessing was added to the parameters used to generate the two factor
data, the results were only slightly different for the "linear" coefficients.
The linear sets of points had somewhat greater spread for the hard items, but
the two dimensions were still clearly recognizable. Figure 17 shows the results
for the tau B coefficient.

The "circular" coefficients were affected somewhat more than the "linear"
coefficients. The tight clusters of points found when there was no guessing
effect were spread quite dramatically, showing the effect of guessing. The
results for Yule's Y are shown in Figure 18, demonstrating this effect.

The scatter in the solutions obtained using the eta and approval coefficients
increased with the presence of guessing to the point where the separate subsets
of items were no longer readily identified. These two coefficients were therefore
dropped from further consideration.

FIGURE 16

TWO-DIMENSIONAL MDSCAL SOLUTION
FOR THE TWO-FACTOR .25 GUESSING DATA
USING THE APPROVAL SCORE

One final analysis was performed on this data-set. The MDSCAL program
was applied to the matrix of tetrachoric correlations corrected for guessing
at the .25 level. The resulting plot of the solution was somewhat clearer
than that for the uncorrected tetrachoric correlations, but the stress increased
from .146 to .174.

After deleting the agreement, approval, and eta coefficients from
consideration because they gave ambiguous results, eight coefficients remained.
These eight were computed on the two-dimensional data-set with normal difficulty
and normal guessing (SD250N.NG20). The results were uniformly good, looking
approximately like Figure 14. Because of their similarity, individual results
will not be presented.

The results of the analysis of the two factor data-sets show that the
MDSCAL program applied to any of kappa, phi, tau B, gamma, Yule's Q, Yule's Y,
tetrachoric, or corrected tetrachoric coefficients yielded solutions capable
of sorting the items into the factors used to generate them. Of this set of
coefficients, Kendall's tau B gave the solution with the lowest stress value.

Cluster Analysis   Both the CLUSTER and HICLUSTER programs were applied
to the three data-sets analyzed by the factor analysis and nonmetric
multidimensional scaling procedures. The results obtained from the application of
these two techniques to the data-sets were generally disappointing. While the
factor analysis and multidimensional scaling procedures could accurately classify
the items into the correct factor, in no case, regardless of the coefficient used,
could the cluster analysis procedure do so. This poor showing occurred despite
the fact that a two cluster solution was assumed in advance. If the number of
clusters present in the data had not been known, the results would have been
much worse, since no reasonable criterion was known for determining the number
of clusters.

To demonstrate the poor quality of the information obtained from the
cluster analysis procedures, the number of misclassified items based on the
analyses of the data using 10 different coefficients is shown in Table 10.
These results were based on a two cluster solution using the HICLUSTER procedure,
and the closest to a two cluster solution that could be obtained from the CLUSTER
procedure by varying the criterion for entering a cluster. The results are
presented for the SD250R.CG25 data-set. The results for the data-set with a
normal distribution of item difficulties were substantially better, with few
errors in classification, but the items in that data-set have been shown to be
very easy to classify using the other procedures.
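
One way to count misclassified items for a two cluster solution is sketched
below. The report does not state how its error counts were tallied, so the
rule used here (take the smaller count over the two possible matchings of
clusters to factors) is an assumption, and the labels are hypothetical.

```python
def misclassified(true_factor, cluster):
    """Number of items assigned to the wrong cluster in a two cluster solution.

    true_factor -- 0 or 1 for each item, the factor used to generate it
    cluster     -- 0 or 1 for each item, the cluster it was assigned to
    Cluster labels are arbitrary, so both matchings of clusters to factors
    are tried and the smaller error count is reported.
    """
    mismatches = sum(1 for t, c in zip(true_factor, cluster) if t != c)
    return min(mismatches, len(true_factor) - mismatches)

true_factor = [0, 0, 0, 1, 1, 1]
cluster = [1, 1, 0, 0, 0, 0]
errors = misclassified(true_factor, cluster)
```

In this example the cluster labels are reversed relative to the factors, and
after allowing for that reversal only one item is misplaced.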

As can be seen from Table 10, many of the items were placed in clusters
defined by items from the other factor. The agreement and approval scores
yielded particularly bad results for the CLUSTER procedure because many different
clusters were formed, none of which conformed to the structure used to generate
the data. Ironically, the approval score, which gave the worst results for the
CLUSTER program, gave the best results for the HICLUSTER program. Because of the
erratic and often poor results obtained from the cluster analysis procedures,
they were removed from further consideration as item sorting techniques.

Table 10

Errors in Classification of Items Onto Dimensions
for the CLUSTER and HICLUSTER Programs
Using a Variety of Coefficients

                          Program
Coefficient        CLUSTER*     HICLUSTER

Agreement             24            21
Approval              37
Eta                   33            10
Gamma                  7             8
Lijphart              24            21
Phi                   16            22
Tau b                 16            21
Tetrachoric            7             8
Yule's Q               7             8
Yule's Y              14             8

*Some of the errors using CLUSTER were due to the
fact that a two cluster solution could not be obtained.

Latent Trait Analysis   The application of the LOGIST program to the three
two-factor data-sets gave good results for the no guessing and the normally
distributed .20 guessing case, and fairly good results for the two factor data
with rectangular difficulties and .25 guessing. In the former two cases, the
items generated from one factor had uniformly high discrimination parameter
estimates while those from the other were uniformly low. Items could be
correctly classified 100% of the time. In the latter case, six iterations of the
program, deleting low discriminating items after each iteration, were required
in order to get a set of items that had uniformly high discrimination parameter
estimates. Only one item of the 25 items retained came from the alternate
factor. Unfortunately, the six iterations required about nine minutes of CPU
time, compared to about 30 seconds for factor analysis. Unless the number of
iterations needed to form the homogeneous item sets can be kept to a small
number, this procedure may be prohibitively expensive.

Three-Dimensional Simulated Data  The three dimensional data-set generated
for this study was produced to match what was considered to be a reasonable
model of real test data. This data-set had a general first factor with .5
loadings for each item. The second and third factors were bipolar, with half
of the items having .5 loadings on one of the factors and half on the other.
The factor loading matrix used to generate the data is presented in Table 8.
The .5 loadings used for this data-set were thought to be much more reasonable
than the .9 loadings used for the previous data-sets.
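
A minimal sketch of how a data-set of this kind might be generated, assuming a
linear factor model with unit-variance latent responses, a normal cut-point for
each item, and the "knowledge or random guessing" rule. The function name, the
eight-item loading pattern, the difficulty of .5, and the guessing level of .20
are illustrative assumptions; they are not the entries of Table 8.

```python
import numpy as np
from statistics import NormalDist

def simulate_items(loadings, p_know, guess, n_examinees, seed=0):
    """Generate 0/1 responses from a linear factor model plus random guessing.

    loadings : (n_items, n_factors) factor loading matrix
    p_know   : proportion of examinees who know each item (before guessing)
    guess    : probability that a non-knower guesses correctly
    """
    rng = np.random.default_rng(seed)
    loadings = np.asarray(loadings, dtype=float)
    n_items, n_factors = loadings.shape
    # Unique standard deviation that keeps every latent response at variance 1.
    unique_sd = np.sqrt(1.0 - (loadings ** 2).sum(axis=1))
    factors = rng.standard_normal((n_examinees, n_factors))
    noise = rng.standard_normal((n_examinees, n_items))
    latent = factors @ loadings.T + noise * unique_sd
    # Cut each latent response so that a proportion p_know falls above it.
    cuts = np.array([NormalDist().inv_cdf(1.0 - p) for p in p_know])
    knows = latent > cuts
    # Examinees who do not know the answer still succeed with probability guess.
    lucky = rng.random((n_examinees, n_items)) < np.asarray(guess)
    return (knows | lucky).astype(int)

# Hypothetical pattern: a general factor plus two bipolar factors, all at .5.
LOADINGS = np.array([
    [0.5,  0.5,  0.0], [0.5,  0.5,  0.0],
    [0.5, -0.5,  0.0], [0.5, -0.5,  0.0],
    [0.5,  0.0,  0.5], [0.5,  0.0,  0.5],
    [0.5,  0.0, -0.5], [0.5,  0.0, -0.5],
])
responses = simulate_items(LOADINGS, [0.5] * 8, 0.20, 20000, seed=1)
```

With these settings the expected proportion correct is .5 + (.5)(.20) = .60 for
every item, which gives a quick check that the guessing step behaves as intended.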

Three procedures were applied to this data-set: factor analysis, multi- 
dimensional scaling, and latent trait analysis. As mentioned earlier, the 
cluster analysis procedure was dropped from consideration because it could not 
be used to sort items into homogeneous sets. The factor analysis results will 
be presented first.

Factor Analysis  All six of the factor analysis solutions described for
the two factor data were obtained for this data-set. These included the
principal component and principal factor solutions on phi coefficients, tetrachoric
correlations, and tetrachoric correlations corrected for guessing. Of these,
the analysis of the tetrachoric correlations corrected for guessing clearly
did not give a good representation of the structure used to generate the data.
The principal factor solution could not be obtained at all, and the principal
component solution did not give meaningful factors. This is probably due to
the fact that the tetrachoric correlations were corrected for constant guessing
at a .20 level, while the guessing level in the data varied substantially around
.20. Thus, for many of the items the procedure overcorrected for guessing.
These results indicate that correcting for guessing is not a reasonable procedure
with realistic data where the true guessing level of the items is unknown.
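
The overcorrection described above can be illustrated with a small sketch. It
assumes the "knowledge or random guessing" model applied to one 2x2 table of
proportions, and it is not claimed to reproduce the exact formulas of Carroll
(1945). The function names and the example table are hypothetical.

```python
def add_guessing(t00, t01, t10, t11, g_i, g_j):
    """Observed 2x2 proportions after random guessing on items i and j.

    t_xy is the true proportion with knowledge state x on item i and y on item j.
    """
    o00 = t00 * (1 - g_i) * (1 - g_j)          # knows neither, guesses both wrong
    o01 = (t01 + t00 * g_j) * (1 - g_i)        # i wrong, j right
    o10 = (t10 + t00 * g_i) * (1 - g_j)        # i right, j wrong
    return o00, o01, o10, 1.0 - o00 - o01 - o10

def correct_table_for_guessing(o00, o01, o10, o11, g_i, g_j):
    """Invert add_guessing for assumed guessing levels g_i and g_j."""
    t00 = o00 / ((1 - g_i) * (1 - g_j))
    t01 = o01 / (1 - g_i) - t00 * g_j
    t10 = o10 / (1 - g_j) - t00 * g_i
    return t00, t01, t10, 1.0 - t00 - t01 - t10

true_table = (0.45, 0.05, 0.05, 0.45)
observed = add_guessing(*true_table, 0.05, 0.05)       # true guessing is only .05
recovered = correct_table_for_guessing(*observed, 0.05, 0.05)
overcorrected = correct_table_for_guessing(*observed, 0.20, 0.20)
```

Correcting at the true level recovers the original table. Correcting at .20 when
the true level is .05 drives the two off-diagonal cells below zero, an impossible
table from which no sensible tetrachoric correlation can be computed.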

Of the other solutions, the principal component solution on the tetrachoric
correlations and the principal factor solution on phi coefficients gave the
best results. The varimax rotation of the principal factor solution on phi
coefficients was especially accurate, correctly classifying all of the items.
This solution is presented in Table 11. Note that in this solution separate
factors were defined by the positive and negative ends of the factors used to
generate the data. The good results obtained for the analysis of phi coefficients
reinforce the results obtained on the other simulated data-sets, possibly
indicating that the principal factor techniques on phi coefficients should be
used for item sorting with real test data.
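
The sorting procedure described above can be sketched as follows. This is an
illustration under stated assumptions, not the program used in the study: phi
coefficients are taken as Pearson correlations of the 0/1 responses, communalities
are iterated from squared multiple correlations, and the rotation is the common
SVD form of varimax. The function names and the small two-factor example are
hypothetical.

```python
import numpy as np

def principal_factors(R, n_factors, n_iter=50):
    """Principal factor loadings with iterated communality estimates."""
    R = np.asarray(R, dtype=float)
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))      # squared multiple correlations
    for _ in range(n_iter):
        reduced = R.copy()
        np.fill_diagonal(reduced, h2)               # communalities on the diagonal
        vals, vecs = np.linalg.eigh(reduced)
        top = np.argsort(vals)[::-1][:n_factors]
        loadings = vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))
        h2 = np.clip((loadings ** 2).sum(axis=1), 0.0, 1.0)
    return loadings

def varimax(loadings, n_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a loading matrix."""
    L = np.asarray(loadings, dtype=float)
    n_items, n_factors = L.shape
    rotation = np.eye(n_factors)
    last = 0.0
    for _ in range(n_iter):
        rotated = L @ rotation
        target = rotated ** 3 - rotated * (rotated ** 2).sum(axis=0) / n_items
        u, s, vt = np.linalg.svd(L.T @ target)
        rotation = u @ vt
        if s.sum() - last < tol:
            break
        last = s.sum()
    return L @ rotation

# Hypothetical example: six items from one factor, four from another.
rng = np.random.default_rng(0)
true_factor = np.array([0] * 6 + [1] * 4)
factors = rng.standard_normal((2000, 2))
latent = 0.8 * factors[:, true_factor] + 0.6 * rng.standard_normal((2000, 10))
responses = (latent > 0).astype(int)

phi = np.corrcoef(responses, rowvar=False)          # phi = Pearson r on 0/1 data
rotated = varimax(principal_factors(phi, 2))
assigned = np.abs(rotated).argmax(axis=1)           # sort by largest loading
```

Each item is assigned to the factor on which its absolute loading is largest.
The factor labels are arbitrary, so only the grouping of items is meaningful.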

Nonmetric Multidimensional Scaling  The nonmetric multidimensional scaling
analysis of the three-dimensional data using the eight coefficients selected
on the basis of the previous analyses gave uniformly good results. In all cases
except when the tetrachoric correlations were overcorrected for guessing at the
.25 level, every item could be correctly classified onto the appropriate factor.
As with the factor analysis, the items from the opposite ends of the bipolar
factors were put into separate clusters. Those from the same factor were at
opposite ends of the diagram in a two-dimensional plot.

Although all eight coefficients could be used to accurately sort the
items into homogeneous sets, there were slight differences in the stress of
the solutions. Stress values ranged from .114 to .125, with Yule's Y and the
tetrachoric correlations giving the smallest values. The two-dimensional
MDSCAL solution for Yule's Y is given in Figure 19.
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Table 11 

Factor Loading Matrix from the Varimax Rotation 
of the Four Factor Principal Factor Solution
on the Three Dimensional Data-Set. 

Latent Trait Analysis  The analysis of the three factor data with the
LOGIST program resulted in an accurate classification of the items onto the
respective factors. The discrimination parameter estimates for the items from
one end of a single bipolar factor all had uniformly high values of around
1.0, while the rest of the items had parameter estimates of .30 or less. As
with the previous analyses, each end of the bipolar factors defined separate
sets of items. To obtain the complete sorting of the items, three runs of the
program were required.

Summary  All three procedures used to analyze the three factor data-set
resulted in solutions that could be used to form homogeneous item sets. The
factor analysis procedure defined clear sets of items using the principal
component procedure on tetrachoric correlations and the principal factor
procedure on phi coefficients. The MDSCAL program gave clear solutions using
the phi, tau B, kappa, tetrachoric, corrected tetrachoric, Yule's Y, Yule's Q,
and gamma coefficients. Only when the tetrachoric correlations were corrected
at too high a level did the procedure degenerate. A similar finding was
observed with the factor analysis procedures. The LOGIST analysis of the data
also gave accurate sortings of the items, but three program runs were required
to sort all of the items. The results of the application to a more realistic
nine factor data-set will now be reported.

Nine Factor Simulated Data

The nine factor simulated data-set was the most realistic of the
simulation data-sets produced. Its characteristics were designed to match those of
an actual achievement test measuring nine content areas. This data-set had a
general factor and eight group factors, the last one being bipolar. The major
loadings on the first eight factors were all positive, reflecting the structure
seen in most achievement tests. The factor loading matrix used to produce this
data-set is given in Table 12.

The results of the analysis of this data-set using factor analytic
techniques are similar to those obtained for the three factor data-set. Both
the principal components analysis on tetrachoric correlations and the principal
factor analysis on phi coefficients gave results that were easily used to sort
the items into homogeneous groups. As an example of these results, the varimax
rotated principal factor solution is shown in Table 13. Notice that no general
factor is present in this solution. The general factor was present in the
initial principal factor solution, but was rotated out with the varimax rotation.

Nonmetric Multidimensional Scaling  The application of the MDSCAL program
to the complex nine factor data-set using the eight coefficients selected on the
basis of the previous analyses gave generally good results. Only the tetrachoric
correlations corrected for guessing gave poor results. The problem with that
coefficient again seemed to be overcorrecting for guessing due to the fact that
the guessing level for individual items was unknown. The other seven coefficients
gave good results, with Yule's Y, Yule's Q, gamma, and the tetrachoric correlation
having slightly higher stress values than the phi, tau B, and kappa coefficients.
[...] dimensional plots. The plot for the [...] is given in Figure 20.
Examination of the plot will show that the items in the data-set have been
divided into nine clearly distinguished clusters.

Table 12

Factor Loading Matrix Used to Generate
the Nine-Factor Data-Set

Table 13

Factor Loadings from the Varimax Rotation
of the Principal Factor Analysis
of the Nine-Factor Data-Set

Latent Trait Analysis  The LOGIST analysis of the nine factor data-set
[...] and c-parameter values were [...] a-parameters gave very little
indication of the items [...] particular factor. The a-parameter estimates
varied between [...] no noticeable relationship to the factor structure. Despite
the initial ambiguous look of the results, a homogeneous set of [...] items could
be obtained by running the program eight times, deleting the items with the
lowest a-value estimates after each run. Since such a procedure is clearly
impractical, the LOGIST program does not seem to be a viable procedure for
forming unidimensional item sets.

Summary  From the analysis of the one-, two-, three-, and nine-dimensional
simulated data-sets, the factor analysis and multidimensional scaling procedures
seem most useful for sorting items into unidimensional item sets. Of the factor
analysis procedures, principal component analysis of tetrachoric correlations
and principal factor analysis of phi coefficients gave the best results.
Of the coefficients used with the MDSCAL program, those that are
not affected by item difficulty seem to give a slightly better sorting of the
items than those coefficients that are affected by item difficulty. These
coefficients include Yule's Y, Yule's Q, gamma, and the tetrachoric correlation.
Of these, Yule's Y seems to be a good choice for forming item sets because of
its ease of computation and clear solutions for the simulation data.

The results for the latent trait analysis approach indicated that the
procedure is too time consuming for regular use as an item sorting procedure.
Although the procedure can be used to get homogeneous item sets, it requires
repeated analyses and uses substantially more computer time than the other
procedures.

The cluster analysis procedures used on the simpler data-sets were found
to be inadequate for item sorting. No reasonable means could be found to
determine the number of clusters, and the clusters that were formed often
contained only some of the items generated from the same factor.

On the basis of these results, only the principal component analysis of
tetrachoric correlations, principal factor analysis of phi coefficients, and
MDSCAL analysis of the seven coefficients mentioned above can be recommended
for forming sets of items compatible with IRT methods. These techniques were
applied to the test data from the Iowa Tests of Educational Development as a
final evaluation of their capabilities.

FIGURE 20

TWO-DIMENSIONAL MDSCAL SOLUTION

ITED Data

The ITED data-set was produced by randomly sampling 33 items from the 69
items in the Expression subtest and 17 items from the 36 items in the
Quantitative subtest of Form Y-6 to form a 50 item test that should have had two
relatively distinct components. For ease of analysis, the 33 verbal items were
placed first in the test, followed by the 17 quantitative items. For these
items, responses for 1000 examinees were sampled from the responses of 4000
examinees who took the test during the 1975-1976 school year. The examinees
were equally divided among Grades 9, 10, 11, and 12. By producing the data-set
in this way, it was hoped that a real data-set of known structure would be
developed.

Factor Analysis  The results of the varimax rotation of the principal
factor solution of phi coefficients for the ITED data-set are presented in
Table 14. The results of the principal component solution for tetrachoric
correlations were similar and will not be shown. As can be seen from the table,
two major factors are present in the data. Factor I is composed of most of the
items from 4 to 23, which are all verbal comprehension items. Factor II is
composed of most of the items from 34 to 50, which are all quantitative items.
Only 17 of the 50 items in the test do not load on these two factors. Of these,
six were spelling items that were mistakenly included with the verbal
comprehension items (Items 28-33). These results show a relatively clear sorting
of the test items into homogeneous content areas.

Nonmetric Multidimensional Scaling  The results of the application of
the MDSCAL program to the inter-item similarities obtained from the seven
coefficients retained up to this point were much the same, with stress values
only ranging from .252 to .255 and little variation in the two-dimensional plots
of the items. Figure 21 shows a representative two-dimensional plot of the
interrelationships of the 50 items based on Yule's Y coefficient. The initial
impression obtained from this plot was that there was no clear separation of
items into the different content areas. Without knowing which items were verbal
and which were quantitative, this procedure could not give information for
accurately sorting the items by type. Examination of higher dimension solutions
also gave no clear results.

The use of knowledge concerning the content area measured by each item
gives a more positive interpretation to these data. Items 34 to 50 are all
quantitative items. The MDSCAL analysis resulted in a two-dimensional
representation that placed all of these items in close proximity in the right side
of the plot. The results of the procedure actually produced a fairly distinct
separation of these items from the verbal items. Unfortunately, this pattern
was very difficult to distinguish without previous knowledge of the structure
of the test. For these data, at least, the factor analysis procedure gave
information that was more useful for sorting items into unidimensional sets.



Table 14

Varimax Factor Loading Matrix
from the Principal Factor Solution of Interitem Phi Coefficients
for the ITED Data

FIGURE 21



Discussion

The purpose of this report has been to investigate techniques for forming
sets of test items that meet the assumptions of most latent trait models. That
is, procedures were evaluated for sorting items into sets that measured a
single latent trait. The investigation of this problem was performed using
three approaches. First, a theoretical model of guessing based on the
"knowledge or random guessing" principle was produced and some theoretical results
were determined. Although this model is clearly not a correct reflection of
the way individuals really interact with test items, it was hoped that some
insights into the effects of guessing on the observed dimensionality of item
sets would be obtained.

The second approach taken in this investigation was to generate simulated
test data according to the theoretical model produced in the first part of the
study and use that data to evaluate factor analysis, cluster analysis, nonmetric
multidimensional scaling, and latent trait analysis on their ability to form
item sets measuring a single dimension. Data-sets with various numbers of
factors were produced for this purpose, and the amount of guessing affecting
the items was varied. Since the true structure of these data-sets was known,
the quality of the results obtained from the four techniques considered was
easy to evaluate.

The third approach taken in this research was to produce a data-set of
known structure from existing response data on subtests of the Iowa Tests of
Educational Development, and to attempt to recover that structure using the
four techniques mentioned above. The data-set produced contained quantitative
and verbal items, which logically should have resulted in two homogeneous
subsets of items. This approach was included in the study since simulation data
never really do an adequate job of modeling the interaction of examinees with
test items. This "real" test data-set was the most stringent test of the
procedures.

The results of the research reported here often matched what would be
expected on the basis of a logical analysis of guessing and dimensionality
effects, but sometimes unanticipated results were obtained. For example, the
theoretical model predicted that, as guessing increased, the proportion of
variance accounted for by the major factor in a test would decrease. This
result was expected and was supported by the analysis of the simulation data.
The review of the literature also suggested such a relationship. However, it
was unexpected that an interaction would be found between the level of guessing
and the saturation of an item with the major component on a test. Highly
discriminating items were found to be more affected by low levels of guessing
than low discriminating items, while the reverse was true for high
levels of guessing (above .25). Since most multiple-choice items have average
guessing levels below .25, this implies that guessing is a more serious problem
for good items. This finding had not been seen in the research literature
previously.

It is interesting to note that the theoretical predictions concerning
guessing, including those presented in this paper, are not all consistent with
each other. The results obtained by Plumlee (1952), Carroll (1945), Mattson
(1965), and Denny and Remmers (1940) certainly are not consistent, and the
results presented here do not agree with any of these. This multiplicity
of results reflects the complexity of the guessing phenomenon and the numerous
approaches taken to modeling guessing.

The results of the analysis of the simulation data were consistent with
the theoretical predictions from the models. With increased guessing, the
proportion of variance accounted for by the first factor in a test decreased, and
"guessing" factors appeared. Also, the magnitude of the loading of individual
items decreased with increased guessing, and the effect was stronger for the
more difficult items. All of these results were expected. What was not expected
was that tests with rectangular distributions of traditional item difficulty
were required to make these effects clearly evident. With more realistic normal
distributions of item difficulty, the guessing effects were much smaller. This
suggests that guessing effects may not be too serious a problem in actual
testing settings when the item difficulty is not too extreme.
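
The stronger effect for difficult items can be made concrete with a short
derivation, offered here as a sketch rather than as the model equations of the
report. It assumes two items that are known by the same proportion p of
examinees, share the same guessing level g, and are guessed independently of
knowledge and of each other. Guessing multiplies the covariance by (1 - g)^2 and
changes each item variance from p(1 - p) to P(1 - P) with P = p + (1 - p)g, so the
observed phi is the true phi times (1 - g)p / (p + (1 - p)g). The function name
is hypothetical.

```python
def observed_phi(true_phi, p_know, guess):
    """Phi between two equally difficult items after random guessing is added.

    p_know is the proportion who know each item; guess is the chance of a
    correct guess. Both items are assumed to share the same p_know and guess.
    """
    return true_phi * (1.0 - guess) * p_know / (p_know + (1.0 - p_know) * guess)

easy = observed_phi(0.5, 0.8, 0.2)     # items known by 80% of examinees
medium = observed_phi(0.5, 0.5, 0.2)   # items known by 50% of examinees
hard = observed_phi(0.5, 0.2, 0.2)     # items known by 20% of examinees
```

Under these assumptions the same amount of guessing shrinks the coefficient for
the hard pair far more than for the easy pair, which is consistent with smaller
guessing effects when few items of extreme difficulty are present.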

The use of nonmetric multidimensional scaling, cluster analysis, and
latent trait analysis had not been seen previously in the literature, so many
of the results obtained were unanticipated. The two major kinds of MDSCAL plots
for the one dimensional data, linear and circular, were unexpected, but further
analyses showed that they were a function of the effect of item difficulty on
the magnitude of the similarity coefficients. The linear plots indicate a
difficulty effect, while the circular plots indicate that item difficulty has
little effect.

When beginning this research, it was hoped that cluster analysis or latent
trait analysis would serve as an alternative to factor analysis as a technique
for purifying item pools. Unfortunately, the results of this research indicated
that this hope was unjustified. Cluster analysis seems to be unsuited for this
purpose. Possibly, as the research on cluster analysis progresses, better
guidelines will become available for determining the number of clusters present
in the test data, and the procedure will, as a result, become more useful.
Currently, it cannot be recommended for this use.

Latent trait analysis, the repeated application of the LOGIST program,
did perform the item sorting task well, but in a very cumbersome and expensive
manner. For these reasons, it cannot be recommended.

The multidimensional scaling technique applied in this research did live
up to expectations. For all of the simulation data-sets the procedure presented
information that could be used to identify the unifactor item sets when used
with the phi, kappa, tau B, tetrachoric, gamma, Yule's Q, or Yule's Y coefficients.
Unfortunately, the results were not as good for the real test data. The
quantitative items were well clustered, but it was hard to distinguish between the
quantitative cluster and the verbal items. Perhaps with further research better
results can be obtained with real test data. The results do emphasize the fact
that simulation data are not a good substitute for real data.

The procedures that performed best of all those studied were the
principal component analysis of tetrachoric correlations and the principal
factor analysis of phi coefficients. The interpretation of the factor analysis
results was not as clear as for the MDSCAL results when simulation data were
used, but was more clear for the real data. This was true even for the factor
analysis of phi coefficients, which are supposed to be plagued by difficulty
effects. Difficulty and guessing factors were noted when items of extreme
difficulty were used, but these factors were not found for the more realistic
data-sets.

The factor analysis of tetrachoric correlations worked well when the
principal components technique was used, but not when the principal factor
technique was used. The reason for this may be the added effect of the
instability of the tetrachoric correlations when iterative estimates of
communalities were made. These problems with the estimates of the tetrachoric
correlations were most severe for extremely easy or difficult items.

The end result of the research reported here is that the traditional
factor analysis procedure seems to perform the best of the techniques
investigated for identifying items that form unidimensional item sets. The nonmetric
multidimensional scaling procedure worked well for the simulation data, but the
results for the real data were ambiguous. Because the study may have been
biased in favor of the factor analytic procedures, due to the fact that a linear
model was used to generate the simulation data, the real data-set analyses were
the key to the choice of a procedure. The best procedure when using this
data-set was the factor analysis procedure.



Summary and Conclusions



The effects of guessing on techniques for sorting items into sets that
measure the same single dimension were determined using theoretical, simulation,
and real data analyses. The theoretical results showed that, as guessing
increased, the percent of variance accounted for by the major component of a
test decreased. Guessing was also found to affect highly discriminating items
more than low discriminating items. The results of the theoretical analyses
presented here did not match those presented previously in the literature.

The factor analysis of simulated data-sets mirrored the theoretical
results. As guessing increased, the proportion of variance accounted for by
the first factor of the test declined. The magnitude of the loading of the
items on the factor was also reduced. This effect was strongest for the most
difficult items. When items of extreme difficulty were present in the simulated
tests, guessing and difficulty factors were found.

The application of the nonmetric multidimensional scaling procedure to
the simulation data gave different results depending on the similarity coefficient
that was used. If the coefficient were affected by the difficulty of the items,
a linear pattern was found when the solution was plotted. If the similarity
coefficient were not affected by the difficulty of the items, a circular
plot was obtained. Guessing distorted these patterns, but the MDSCAL procedure
still separated the items into homogeneous sets with little difficulty when
simulation data were used. The procedure did not give adequate results for
real data.

Cluster analysis and latent trait analysis were not found to be useful
for sorting items into unidimensional sets. The cluster analysis procedure
tended to give too many small clusters, and no way was known for combining
them into larger clusters that corresponded to the known structure of the data.
Repeated application of the LOGIST program to sort the items into unidimensional
sets worked, but was too expensive and cumbersome.

Of the procedures studied, the principal component analysis of tetrachoric
correlations and the principal factor analysis of phi coefficients gave the most
consistently positive results. Until a better method can be found, these
time-honored procedures should continue to be used to form unidimensional tests.
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APPENDIX A 

Similarity Coefficients 



Many of the coefficients used in this study are based on the responses
of two items as summarized in a 2x2 or 3x3 contingency table. For consistency
the first 10 coefficients will be described using the following 2x2 table
arrangement:

                       Item i
                   0         1
             0     a         b        a+b
   Item j
             1     c         d        c+d

                  a+c       b+d       N=a+b+c+d

where a, b, c, and d are cell frequencies and N is the total number of
examinees.

Agreement Coefficient

The agreement coefficient (Weisberg, 1968) is the proportion of examinees
responding in the same way to both items, and is given by:

\[ A = \frac{a + d}{N}. \]

Approval Score

The approval score (Weisberg, 1968) is the proportion of examinees passing
both items, and is given by:

\[ \frac{d}{N}. \]


Eta Coefficient

The eta coefficient (Weisberg, 1968) is a measure of any type of association
between two variables. It is usually used to determine the association between
a nominal variable and an interval variable. This eta coefficient is in no way
related to the eta coefficient used in analysis of variance procedures. This
eta is given by the following formulae:

1) if ad > bc and b > c,

\[ \eta = \frac{(ad - bc)(b - c)}{(c + d)(c + a)(b + c)}, \]

2) if ad > bc and b < c,

\[ \eta = \frac{(ad - bc)(c - b)}{(b + d)(b + a)(c + b)}, \]

3) if ad < bc and d > a,

\[ \eta = \frac{(ad - bc)(d - a)}{(a + c)(a + b)(a + d)}, \]

and 4) if ad < bc and d < a,

\[ \eta = \frac{(ad - bc)(a - d)}{(d + b)(d + c)(a + d)}. \]

Kappa Coefficient

The kappa coefficient (Cohen, 1968) is essentially an agreement score
corrected for chance agreement, and is given by:

\[ \kappa = \frac{(a + d) - [(a + b)(a + c) + (c + d)(b + d)]}{1 - [(a + b)(a + c) + (c + d)(b + d)]}, \]

where the cell entries are expressed as proportions of N.

Koppa Coefficient

The koppa coefficient (MacRae, 1970) is the agreement score corrected
for disagreement, and is given by:

\[ \text{koppa} = (a + d) - (b + c), \]

with the cell entries again expressed as proportions of N.

Phi Coefficient

The phi coefficient (MacRae, 1970) is a Pearson product moment correlation
between binary variables, and is given by:

\[ \phi = \frac{ad - bc}{\sqrt{(a + c)(b + d)(a + b)(c + d)}}. \]

Phi/Phimax Coefficient

The phi/phimax coefficient (Weisberg, 1968) is the phi coefficient
divided by the maximum possible phi coefficient that could be obtained from
a table with the same marginals. The procedure corrects the phi coefficient
for item difficulty effects. The phi/phimax coefficient is given by the
following formulae:

1) if ad > bc and b < c,

\[ \phi' = \frac{ad - bc}{(b + d)(b + a)}, \]

2) if ad > bc and b > c,

\[ \phi' = \frac{ad - bc}{(c + d)(c + a)}, \]

3) if ad < bc and d > a,

\[ \phi' = \frac{ad - bc}{(a + c)(a + b)}, \]

and 4) if ad < bc and d < a,

\[ \phi' = \frac{ad - bc}{(d + b)(d + c)}. \]

Tetrachoric Correlation

The tetrachoric correlation (MacRae, 1970) is an estimate of the
correlation between two continuous variables having a bivariate normal
distribution. It has been assumed that the variables have been artificially dichotomized
to produce the 2x2 table obtained for the two items. The tetrachoric correlation
is approximated by:

\[ r_t = \sin\left(\frac{\pi}{2} \cdot \frac{\sqrt{ad} - \sqrt{bc}}{\sqrt{ad} + \sqrt{bc}}\right), \]

where the value in the parentheses is in radians. The tetrachoric correlations
were corrected for guessing from the 2x2 table using the procedure set out by
Carroll (1945).

Yule's Q Coefficient

Yule's Q (MacRae, 1970) is a measure of the power of one variable to
predict another, and is given by:

\[ Q = \frac{ad - bc}{ad + bc}. \]

Yule's Q is a special case of Goodman and Kruskal's gamma coefficient.

Yule's Y Coefficient

Yule's Y (MacRae, 1970) is given by:

\[ Y = \frac{\sqrt{ad} - \sqrt{bc}}{\sqrt{ad} + \sqrt{bc}}. \]

It can be seen that r_t, Q, and Y are transformations of each other.
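
A minimal sketch of several of the 2x2 coefficients defined above, written from
the cell labels a, b, c, and d of the table at the start of this appendix. The
function name is hypothetical, the tetrachoric value uses the sine approximation
rather than an exact estimate, and the sketch assumes that no cell product makes
a denominator zero.

```python
import math

def two_by_two_coefficients(a, b, c, d):
    """Return similarity coefficients for one 2x2 table of cell frequencies."""
    n = a + b + c + d
    ad, bc = a * d, b * c
    phi = (ad - bc) / math.sqrt((a + c) * (b + d) * (a + b) * (c + d))
    q = (ad - bc) / (ad + bc)
    y = (math.sqrt(ad) - math.sqrt(bc)) / (math.sqrt(ad) + math.sqrt(bc))
    return {
        "agreement": (a + d) / n,          # same response to both items
        "approval": d / n,                 # both items passed
        "phi": phi,
        "Q": q,
        "Y": y,
        "tetrachoric": math.sin(math.pi / 2 * y),   # approximation via Y
    }

coefs = two_by_two_coefficients(40, 10, 10, 40)
# Q can be recovered from Y, which is one of the transformations noted above.
q_from_y = 2 * coefs["Y"] / (1 + coefs["Y"] ** 2)
```

For this table phi and Y are both .6, Q is about .88, and the approximate
tetrachoric correlation is about .81.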

The remaining four coefficients are best described using the following
table:

                       Item j
                 0       1       2
           0     a       b       c       a+b+c
  Item i   1     d       e       f       d+e+f
           2     g       h       i       g+h+i

               a+d+g   b+e+h   c+f+i

where zero represents failing the item, 2 represents passing the item, and 1
represents a neutral or intermediate response.

Goodman and Kruskal's Gamma Coefficient

Goodman and Kruskal's gamma (Hays, 1963) is given by:

\[ \gamma = \frac{S_1 - S_2}{S_1 + S_2}, \]

where

\[ S_1 = a(e + f + h + i) + b(f + i) + d(h + i) + ei \]

and

\[ S_2 = c(d + e + g + h) + b(d + g) + f(g + h) + eg. \]

The gamma coefficient was developed as a measure of association between
ordinal variables.

Kendall's tau B Coefficient

Kendall's tau B (Hays, 1963) is given by:

\[ \tau_b = \frac{S_1 - S_2}{\sqrt{S_3}}, \]

where \(S_1\) and \(S_2\) are as set out above, and

\[ S_3 = [S_1 + S_2 + a(b + c) + bc + d(e + f) + ef + g(h + i) + hi] \times [S_1 + S_2 + a(d + g) + dg + b(e + h) + eh + c(f + i) + fi]. \]
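
The sums used by gamma and tau B can be sketched directly from the 3x3 table,
with rows indexed by Item i and columns by Item j. The function names are
hypothetical, and tau B is computed as the difference between the two sums
divided by the square root of the product defined above.

```python
import math

def ordinal_sums(table):
    """Return (S1, S2, S3) for a 3x3 table given as three rows of three counts."""
    (a, b, c), (d, e, f), (g, h, i) = table
    s1 = a * (e + f + h + i) + b * (f + i) + d * (h + i) + e * i   # concordant pairs
    s2 = c * (d + e + g + h) + b * (d + g) + f * (g + h) + e * g   # discordant pairs
    row_ties = a * (b + c) + b * c + d * (e + f) + e * f + g * (h + i) + h * i
    col_ties = a * (d + g) + d * g + b * (e + h) + e * h + c * (f + i) + f * i
    s3 = (s1 + s2 + row_ties) * (s1 + s2 + col_ties)
    return s1, s2, s3

def gamma_and_tau_b(table):
    """Return Goodman and Kruskal's gamma and Kendall's tau B for a 3x3 table."""
    s1, s2, s3 = ordinal_sums(table)
    return (s1 - s2) / (s1 + s2), (s1 - s2) / math.sqrt(s3)

# With no intermediate responses the table collapses to a 2x2 table.
gamma, tau_b = gamma_and_tau_b([[40, 0, 10], [0, 0, 0], [10, 0, 40]])
```

When no examinee gives the intermediate response, gamma equals Yule's Q and
tau B equals phi for the corresponding 2x2 table.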

Lijphart's Index

Lijphart's index (MacRae, 1970) was developed as a measure of voter
agreement, and is given by:

I = [...]

where A is the agreement score.

Pearson's Correlation

This coefficient is the traditional product moment correlation coefficient,
and is given by:

\[ r = \frac{N(a + i - c - g) - (T_1 - T_2)(T_3 - T_4)}{\sqrt{D_1 D_2}}, \]

where

\[ T_1 = g + h + i, \]

\[ T_2 = a + b + c, \]

\[ T_3 = c + f + i, \]

\[ T_4 = a + g + d, \]

\[ D_1 = N(T_1 + T_2) - (T_1 - T_2)^2, \]

and

\[ D_2 = N(T_3 + T_4) - (T_3 - T_4)^2. \]
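
The formula above is the usual product moment correlation with the three
response categories scored -1, 0, and 1, which is why only the corner cells and
the outer row and column totals appear. A minimal sketch, with a hypothetical
function name:

```python
import math

def pearson_from_table(table):
    """Product moment correlation from a 3x3 table of cell frequencies."""
    (a, b, c), (d, e, f), (g, h, i) = table
    n = a + b + c + d + e + f + g + h + i
    t1, t2 = g + h + i, a + b + c          # row totals for responses 2 and 0
    t3, t4 = c + f + i, a + g + d          # column totals for responses 2 and 0
    d1 = n * (t1 + t2) - (t1 - t2) ** 2
    d2 = n * (t3 + t4) - (t3 - t4) ** 2
    return (n * (a + i - c - g) - (t1 - t2) * (t3 - t4)) / math.sqrt(d1 * d2)

r_ordinal = pearson_from_table([[3, 1, 0], [1, 2, 1], [0, 1, 3]])
r_binary = pearson_from_table([[40, 0, 10], [0, 0, 0], [10, 0, 40]])
```

With no intermediate responses the result equals the phi coefficient of the
collapsed 2x2 table.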



\ 



er|c " 



4^ 






